Introduction to Computer Science Searching Sorting Complexity and Performance Unit 14.

2 Introduction to Computer Science Searching Sorting Complexity and Performance Unit 14

3 14- 2 Sorting and Searching We’ve learned about arrays (one dimensional and two dimensional) We’ve learned how to move through arrays, filling them up or printing them out Two of the most common operations on arrays are –Sorting: placing elements in a given order –Searching: finding where, or if, an element appears in an array

4 14- 3 Sorting and Searching This is a widely studied problem We’ll look at several different algorithms for carrying out sorting and searching Five different search methods, for sorted and unsorted arrays of values Three different sorting methods And later, a recursive sorting method!

5 14- 4 The Framework final int MAX = 10; int[ ] counts = new int[MAX]; void Initialize (int[ ] data) {…} //Loads data with test information int SearchMethod (int[ ] data, // what we search int num) // what we seek {…} //Returns position of num in data, or -1 if absent void SortMethod (int[ ] data) {…} //Sorts the array, if necessary.

6 14- 5 Five Different Search Methods search searches component-by-component through an unordered array. stateSearch is a state-oriented version of search linear searches component-by-component through an ordered array. quadratic is like linear, but takes bigger jumps binary uses a divide-and-conquer algorithm, and is the fastest of all

7 14- 6 Search through an unordered array The specification: Goal: Locate a value, or decide it isn’t there. Intentional bound: Spotting the value. Necessary bound: We’ve reached the last component. Plan: Advance to the next component.

8 14- 7 The search Java Code
int search (int[ ] data, int num) {
    // Search method for an unordered array.
    // Return -1 for absent number.
    int pos = 0; // position of current component
    while ( (data[pos] != num) && (pos < (data.length - 1)) )
        pos++;
    // Postcondition: if data[pos] isn’t num, no component is.
    if (data[pos] == num) return pos;
    else return -1;
} // end of search
Note the ambiguous postcondition.

9 14- 8 Similar Idea, but use a State-Oriented Approach We’ll use the states FOUND, ABSENT, and SEARCHING to control our loop and to let us know the status of the search. Goal: Locate a value, or decide it isn’t there. Bound: Our state is either ABSENT or FOUND Plan: Advance to the next component. Update the state.

10 14- 9 The stateSearch Java Code
int stateSearch(int[ ] data, int num) {
    // State-oriented search of unordered array. Return -1 for absent num
    final int FOUND = 0, ABSENT = 1, SEARCHING = 2;
    int pos = 0, state = SEARCHING;
    do { // until we’re not searching anymore
        if (pos >= data.length) state = ABSENT;
        else if (data[pos] == num) state = FOUND;
        else pos++;
    } while (state == SEARCHING);
    // Postcondition: if num was there, state’s FOUND and data[pos] is num
    // An if/else returns on every path; a switch with no default case
    // makes the Java compiler report a missing return statement.
    if (state == FOUND) return pos;
    else return -1;
} // end of stateSearch

11 A Version with a Bug So why don’t we do it this way?
do { // until we’re not searching anymore
    if (data[pos] == num) state = FOUND;
    else if (pos >= data.length) state = ABSENT;
    else pos++;
} while (state == SEARCHING);
Answer: if num isn’t in the array, we’d eventually try to access a component beyond the end of the array.

12 Performance of the Search Algorithms The two algorithms we’ve looked at, for a problem of size n: –worst-case size of the search is n; we’d have to look through every value in the array –best-case size of the search is 1; we’d find the value in the first component of the array –average-case size of the search is n/2; the value might be anywhere, with equal probability

13 Searching Ordered Arrays When we had unordered arrays, we had to look through (potentially) every component to find the value we were looking for With ordered arrays, we can be more clever; we look at three techniques: –linear –quadratic –binary

14 Linear Search Very similar to the “search” method before, but stop the search if we find a component beyond the size of the one we are looking for Before, we had while ((data[pos] !=num) && (pos < data.length -1)) { Now, we have while ((data[pos] < num) && (pos < data.length -1)) { And it works, because the array was sorted before we began

15 The linear Java Code
int linear (int[ ] data, int num) {
    // Linear search for num in an ordered array.
    // Return -1 for absent number.
    int pos = 0; // position of current component
    while ( (data[pos] < num) && (pos < (data.length - 1)) )
        pos++;
    // Postcondition: if data[pos] isn’t num, no component is.
    if (data[pos] == num) return pos;
    else return -1;
} // end of linear

16 Quadratic Search What if we were able to improve linear search by –Taking big jumps to get close to the value we’re looking for –Take small steps to locate it exactly

17 Big Jumps, Little Jumps What is the most effective relationship between big jumps and small steps? If the big jumps are too big, too many small steps will be required; if the big jumps are too small, we’ll have to make too many of them Big jump too big Big jump too small

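18 Some Sample Figures
Table: big jump size versus the maximum number of big jumps (for n = 10, 1,000, and 1,000,000) and the maximum number of single steps (for n = 1,000,000). Rows: jump sizes 1, n/10, n/$\log_2 n$, and $\sqrt{n}$. With jump size 1 there are 10, 1,000, and 1,000,000 big jumps and 0 single steps.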

19 Moral of the Story Making the big jumps too big or too small doesn’t help us much We do best when we make roughly equal the number of big jumps and the number of small steps for the worst case The quadratic search algorithm is based on a jump size that equals the square root of the number of components to search
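The trade-off can be checked numerically. The sketch below is an illustration, not part of the original slides: `JumpSizeDemo` and `worstCaseSteps` are hypothetical names, and the cost model (about n / jump big jumps followed by at most jump - 1 single steps) is an assumption chosen to match the discussion above.

```java
// Hypothetical helper class, not from the slides.
final class JumpSizeDemo {
    // Assumed cost model: ceil(n / jump) big jumps, then at most (jump - 1) single steps.
    static long worstCaseSteps(long n, long jump) {
        long bigJumps = (n + jump - 1) / jump;
        long singleSteps = jump - 1;
        return bigJumps + singleSteps;
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        long[] jumps = {1, n / 10, (long) Math.sqrt(n)};
        for (long jump : jumps) {
            System.out.println("jump " + jump + " -> " + worstCaseSteps(n, jump) + " steps");
        }
    }
}
```

Under this model the square-root jump size gives by far the smallest total of the three, because neither term dominates.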

20 The Pseudocode Outline of Quadratic Search
calculate the step size;
do {
    update states and position
} while (state != CLOSEENOUGH); // end of big jump loop
// Postcondition: if the value is there, another big step would go past it.
reset state to SEARCHING;
do {
    update states and position
} while (state == SEARCHING); // end of single step loop
// Postcondition: if state == FOUND then data[position] == num
set position to -1 if state equals ABSENT

21
int quadratic(int[ ] data, int num) {
    // Quadratic search through an ordered array. Return -1 for absent num
    final int FOUND=0, ABSENT=1, SEARCHING=2, CLOSEENOUGH=3;
    int state = SEARCHING, position = 0, jumpSize;
    jumpSize = (int) (Math.sqrt(data.length));
    do { // by big jumps until we’re close enough
        if ( (position + jumpSize) >= data.length) state = CLOSEENOUGH;
        else if (data[position + jumpSize] > num ) state = CLOSEENOUGH;
        else position = position + jumpSize;
    } while (state != CLOSEENOUGH);
    // Postcondition: if num is there, data[position] <= num
    state = SEARCHING; // reset the current state
    do { // by single steps until we’re not searching
        if (position >= data.length) state = ABSENT;
        else if ( data[position] > num ) state = ABSENT;
        else if ( data[position] == num ) state = FOUND;
        else position++; // state is unchanged
    } while (state == SEARCHING);
    // Postcondition: if num's there, state’s FOUND and data[position] is num
    if (state == ABSENT) return -1;
    else return position;
} // quadratic

22 What’s our analysis of Quadratic search? The best case is easy: 1 step The worst case is about twice the square root of n (one less than the square root of n big steps, one less than the square root of n small steps) The average case about equals the square root of n (about 1/2 square root of n big steps, and about 1/2 square root of n small steps)

23 Binary Search How do we improve on quadratic search? By making the jump size variable and dynamic The jumps start big, then get smaller and smaller The first jump is half the size of the array, the second is 1/4, the third is 1/8, the fourth is 1/16, … This is the classic divide and conquer algorithm

24 Data Bounds that Bracket the Unknown Area of the Array At the start, all of the stored values are unknown: ????????????? (look here first) ????>>>>>>>?? (still unknown; definitely too high; look here second) >>>>>><< (still unknown; definitely too high; definitely too low; look here next)

25 Termination When the lower and upper bounds of the unknown area pass each other, the unknown area is empty and we terminate (unless we’ve already found the value) Goal: Locate a value, or decide it isn’t there Intentional Bound: We’ve found the value Necessary Bound: The lower and upper bounds of our search pass each other Plan: Pick a component midway between the upper and lower bounds. Reset the lower or upper bound, as appropriate.

26 The Binary Search Pseudocode
initialize lower and upper to the lower and upper array bounds;
do {
    let middle equal (lower plus upper) / 2;
    if the value of data[middle] is low
        make middle (plus 1) the new lower bound
    else
        make middle (minus 1) the new upper bound
} while we haven’t found the value and unknown data remains;
decide why we left the loop, and return an appropriate position

27 The Binary Search Java Code
int binary (int[ ] data, int num) {
    // Binary search for num in an ordered array
    int middle, lower = 0, upper = (data.length - 1);
    do {
        middle = ((lower + upper) / 2);
        if (num < data[middle]) upper = middle - 1;
        else lower = middle + 1;
    } while ( (data[middle] != num) && (lower <= upper) );
    // Postcondition: if data[middle] isn’t num, no component is
    if (data[middle] == num) return middle;
    else return -1;
} // binary

28 Subtle Boundary Conditions Why do we write "<=" here, and not "<"? do { middle = ((lower + upper) / 2); if (num < data[middle]) upper = middle - 1; else lower = middle + 1; } while ( (data[middle] != num) && (lower <= upper) );

29 What Happens When lower equals upper? <<>>>>>><< lower upper do { middle = ((lower + upper) / 2); if (num < data[middle]) upper = middle - 1; else lower = middle + 1; } while ( (data[middle] != num) && (lower <= upper) ); middle is here or there We need one more loop to set middle to the ? spot, and see whether data[middle] == num

30 By the way, what happens when lower touches upper? <<>>>>><< lower upper middle is here or there The next loop sets middle equal to lower, and lower equal to upper (i.e., like the previous slide) do { middle = ((lower + upper) / 2); if (num < data[middle]) upper = middle - 1; else lower = middle + 1; } while ( (data[middle] != num) && (lower <= upper) );

31 What’s our analysis of Binary search? The best case is easy: 1 step The worst case is about $\log_2 n$ steps (how many times can n be divided in half before we’re left with an array of length 1? Starting with 1, how many times can you double a value until it’s as large as n?) The average case requires a more detailed analysis
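The worst-case claim can be observed by counting probes. The following is a sketch, not the slides' `binary` method: it uses a `while` loop so that an empty array is handled, computes the midpoint as `lower + (upper - lower) / 2` to avoid integer overflow on very large arrays, and returns a hypothetical `{position, probes}` pair. The class name `BinarySearchSteps` is made up for this illustration.

```java
// Hypothetical probe-counting variant of binary search, not from the slides.
final class BinarySearchSteps {
    // Returns {position, probes}; position is -1 when num is absent.
    // Assumes data is sorted in ascending order.
    static int[] search(int[] data, int num) {
        int lower = 0, upper = data.length - 1, probes = 0;
        while (lower <= upper) {                       // unknown area is not empty
            int middle = lower + (upper - lower) / 2;  // midpoint without int overflow
            probes++;
            if (data[middle] == num) return new int[] {middle, probes};
            if (num < data[middle]) upper = middle - 1;
            else lower = middle + 1;
        }
        return new int[] {-1, probes};
    }

    public static void main(String[] args) {
        int n = 1024;
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = 2 * i;   // sorted even numbers
        int worst = 0;
        for (int i = 0; i < n; i++) {
            worst = Math.max(worst, search(data, data[i])[1]);
        }
        System.out.println("worst-case probes for n = " + n + ": " + worst);
    }
}
```

Doubling `n` in `main` should raise the printed worst case by only one probe, which is the slow logarithmic growth described above.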

32 What’s really going on with average case size Assume, first of all, that what we are searching for is in the array (if not, of course, average case of the search might be affected by how often the item is not in the array) In our searching algorithms, the average case size can be thought of as the sum $\sum_i p_i d_i$ where $p_i$ is the probability of finding the item at a given step, and $d_i$ is the “amount of work” to reach that step.

33 Simple Example Given: a 3-element array, 90% chance of finding what we want in the first cell, 5% in the second, 5% in the third We search linearly What’s the expected average amount of work to find what we are looking for? (.9 * 1) + (.05 * 2) + (.05 * 3) = 1.15 steps on average

34 Simple Example Given: a 3-element array, 90% chance of finding what we want in the first cell, 5% in the second, 5% in the third We search linearly What’s the expected average amount of work to find what we are looking for? (.9 * 1) + (.05 * 2) + (.05 * 3) = 1.15 steps on average, where .9 is the probability of finding it after 1 step, .05 the probability of finding it after 2 steps, and .05 the probability of finding it after 3 steps

35 Average Case Size of search and stateSearch Algorithms We said “average-case size of the search is n/2; the value might be anywhere, with equal probability” In other words, our search might be (1 * [1/n]) + (2 * [1/n]) + (3 * [1/n]) + … + (n * [1/n]) in other words, $\frac{1}{n} \sum_{i=1}^{n} i$ in other words, 1/n * ([n * (n + 1)] / 2) in other words, n/2 + ½
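Both the 90% / 5% / 5% example and the uniform case are instances of the same weighted sum of probability times work. The sketch below assumes a linear search in which reaching cell i (counting from 0) costs i + 1 steps; `ExpectedWork`, `expectedSteps`, and `uniformExpectedSteps` are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

// Hypothetical helper class, not from the slides.
final class ExpectedWork {
    // Sum of p[i] * d[i], where d[i] = i + 1 steps to reach cell i in a linear search.
    static double expectedSteps(double[] p) {
        double total = 0.0;
        for (int i = 0; i < p.length; i++) {
            total += p[i] * (i + 1);
        }
        return total;
    }

    // Uniform case: every one of the n cells has probability 1/n.
    static double uniformExpectedSteps(int n) {
        double[] p = new double[n];
        Arrays.fill(p, 1.0 / n);
        return expectedSteps(p);
    }

    public static void main(String[] args) {
        System.out.println(expectedSteps(new double[] {0.9, 0.05, 0.05}));
        System.out.println(uniformExpectedSteps(100));  // compare with n/2 + 1/2
    }
}
```

Up to floating-point rounding, the first call reproduces the 1.15 steps of the simple example and the second matches n/2 + ½ for n = 100.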

36 Constants Fade in Importance We will see soon that if the expected amount of work is: n/2 + ½ what really interests us is the “shape” of the function The ½ fades away for large n Even the division of n by 2 is basically unimportant (since we didn’t really quantify how much work each cell of the array took) What’s important is that the expected work grows linearly with the size of the array (i.e., the input)

37 Asymptotic Behavior of (Some) Functions

38 Average Case Size of Binary Search Algorithm There is 1 element we can get to in one step… Average work = (1*(1/n))… where 1/n is the probability of finding it after 1 step

39 Average Case Size of Binary Search Algorithm There is 1 element we can get to in one step, 2 that we can get to in two steps… Average work = (1*(1/n)) + (2*(2/n))… where 2/n is the probability of finding it after 2 steps

40 Average Case Size of Binary Search Algorithm There is 1 element we can get to in one step, 2 that we can get to in two steps, 4 that we can get to in three steps… Average work = (1*(1/n)) + (2*(2/n)) + (3*(4/n))… where 4/n is the probability of finding it after 3 steps

41 Average Case Size of Binary Search Algorithm There is 1 element we can get to in one step, 2 that we can get to in two steps, 4 that we can get to in three steps, 8 that we can get to in four steps… Average work = (1*(1/n)) + (2*(2/n)) + (3*(4/n)) + (4*(8/n))… where 8/n is the probability of finding it after 4 steps

42 Average Case Size of Binary Search Algorithm There is 1 element we can get to in one step, 2 that we can get to in two steps, 4 that we can get to in three steps, 8 that we can get to in four steps… In other words, there is 1/n chance of 1 step, 2/n chance of 2 steps, 4/n chance of 3 steps, 8/n chance of 4 steps…

43 Average Case Size of Binary Search Algorithm In other words, we have 1/n * [(1*1)+(2*2)+(4*3)+(8*4)+…+(n/2 * $\log_2 n$)] In other words, we have $\frac{1}{n} \left[ \sum_{i=1}^{\log_2 n} 2^{i-1} \cdot i \right]$ For large n, this converges to ($\log_2 n$) - 1, so that’s our average case size for binary search
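The limit can be checked by evaluating the sum directly. This sketch assumes n is an exact power of two, n = 2^k, so that $\log_2 n = k$; `BinaryAverage` and `averageSteps` are hypothetical names, not part of the slides.

```java
// Hypothetical helper class, not from the slides.
final class BinaryAverage {
    // Evaluates (1/n) * sum over i = 1..k of 2^(i-1) * i, for n = 2^k.
    static double averageSteps(int k) {
        long n = 1L << k;
        long weighted = 0;
        for (int i = 1; i <= k; i++) {
            weighted += (1L << (i - 1)) * i;   // 2^(i-1) elements cost i steps each
        }
        return (double) weighted / n;
    }

    public static void main(String[] args) {
        for (int k : new int[] {4, 10, 20}) {
            System.out.println("log2 n = " + k + ", average = " + averageSteps(k)
                    + ", (log2 n) - 1 = " + (k - 1));
        }
    }
}
```

The sum has the closed form $(k-1)2^k + 1$, so the average is $(k - 1) + 1/n$, and the gap to ($\log_2 n$) - 1 shrinks as n grows.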

44 Again, Constants Fade in Importance If we calculated that the average case for binary search is ($\log_2 n$) - 1 We don’t care how much work we did in each cell What’s really interesting is the “shape” of the function for large inputs (i.e., large n) So we would say that the average case for binary search is $O(\log_2 n)$ (read “Big-O” of $\log_2 n$) We’ll define this precisely soon

45 Asymptotic Behavior of (Some) Functions

46 Sorting Sorting data is done so that subsequent searching will be much easier void Initialize (int[ ] data) {…} //Loads data with test information int SearchMethod (int[ ] data, // what we search int num) // what we seek {…} //Returns position of num in data, or -1 if absent void SortMethod (int[ ] data) {…} //Sorts the array, if necessary.

47 Three Different Algorithms for Sorting Select is based on selection sorting Insert is based on insertion sorting Bubble is based on bubble sorting Assume in our examples that the desired order is largest to smallest starting order:

48 Selection Sort starting order: search through array, find largest value, exchange with first array value: search through rest of array, find second-largest value, exchange with second array value:

49 Continue the Select and Exchange Process search through rest of array, one less each time:

50 Selection Sort Pseudocode
for every “first” component in the array
    find the largest component in the rest of the array (from the “first” component onward);
    exchange it with the “first” component

51 Insertion Sort starting order: move through the array, keeping the left side ordered; when we find the 35, we have to slide the 18 over to make room: continue moving through the array, always keeping the left side ordered, and sliding values over as necessary to do so: 18 slid over

52 Continue the Insertion Process the left side of the array is always sorted, but may require one or more components to be slid over to make room: 35, 22, and 18 slid over , 22, and 18 slid over , 22, and 18 slid over

53 Continue the Insertion Process , 35, 22, and 18 slid over nothing slides over , 22, 18, and 10 slid over

54 Insertion Sort Pseudocode for every “newest” component remaining in the array temporarily remove it; find its proper place in the sorted part of the array; slide smaller values one component to the right; insert the “newest” component into its new position;

55 Bubble Sort starting order: compare the first two values; if the second is larger, exchange them: next, compare the second and third values, exchanging them if necessary:

56 Much Ado About Nothing The comparison continues, third and fourth, fourth and fifth, etc., with exchanges occurring when necessary. In the end, the smallest value has “bubbled” its way to the far right—but the rest of the array still isn’t ordered:

57 Continue the bubbling Next, go back to beginning, and do the same thing, comparing and exchanging values (except for the last) The second smallest value has now bubbled to the right. Do the same from the beginning, but ignoring the last two values:

58 Bubble Sort Pseudocode
for every “last” component
    for every component from the first up to (but not including) the “last”
        compare that component to the one immediately after it;
        exchange them if necessary;

59 Each Method’s Advantages Selection sort is simple because it requires only two-value exchanges Insertion sort minimizes unnecessary travel through the array. If the values are sorted to begin with, a single trip through the array establishes that fact (selection sort requires the same number of trips no matter how organized the array is) Bubble sort requires much more work, but…well,…uh,…it’s the easiest one to code!

60 Stable Sorting vs. Unstable Sorting Techniques An array might include elements with exactly the same "sorting value" (e.g., objects are in the array, and we're sorting on some attribute) Sorts that leave such components in order are called stable, while sorts that may change order are called unstable
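A small experiment makes the distinction concrete. The sketch below is an illustration under stated assumptions: the elements are made-up labels such as "2a", the sort key is the leading digit, and the order is largest to smallest as elsewhere in these slides. `StabilityDemo`, `key`, `insertionSort`, and `selectionSort` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical demo class, not from the slides.
final class StabilityDemo {
    // Sort key: the leading digit of a label such as "2a".
    static int key(String s) {
        return s.charAt(0) - '0';
    }

    // Insertion sort, largest key first. The strict '<' never slides an item
    // past another item with an equal key, so equal keys keep their order.
    static String[] insertionSort(String[] in) {
        String[] data = in.clone();
        for (int newest = 1; newest < data.length; newest++) {
            String item = data[newest];
            int current = newest;
            while (current > 0 && key(data[current - 1]) < key(item)) {
                data[current] = data[current - 1];
                current--;
            }
            data[current] = item;
        }
        return data;
    }

    // Selection sort, largest key first. The long-distance swap can carry an
    // item past others with the same key, so equal keys may change order.
    static String[] selectionSort(String[] in) {
        String[] data = in.clone();
        for (int first = 0; first < data.length - 1; first++) {
            int largest = first;
            for (int current = first + 1; current < data.length; current++) {
                if (key(data[current]) > key(data[largest])) largest = current;
            }
            String temp = data[largest];
            data[largest] = data[first];
            data[first] = temp;
        }
        return data;
    }

    public static void main(String[] args) {
        String[] input = {"2a", "2b", "3c"};
        System.out.println("insertion: " + Arrays.toString(insertionSort(input)));
        System.out.println("selection: " + Arrays.toString(selectionSort(input)));
    }
}
```

On this input the insertion sort keeps "2a" ahead of "2b", while the selection sort's first swap moves "2a" to the far end and reverses the pair.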

61 The Selection Sort Java Code
void select (int[ ] data) {
    // Uses selection sort to order an array of integers.
    int first, current, largest, temp;
    for (first = 0; first < data.length - 1; first++) {
        largest = first;
        for (current = first + 1; current < data.length; current++) {
            if ( data[current] > data[largest] )
                largest = current;
        }
        // Postcondition: largest is index of largest item
        // from first..end of array
        if (largest != first) { // We have to make a swap
            temp = data[largest];
            data[largest] = data[first]; // Make the swap
            data[first] = temp;
        }
    } // first for
} // select

62 The Insertion Sort Java Code
void insert (int[ ] data) {
    // Uses insertion sort to order an array of integers.
    int newest, current, newItem;
    boolean seeking;
    for (newest = 1; newest < data.length; newest++) {
        seeking = true;
        current = newest;
        newItem = data[newest];
        while (seeking) { // seeking newItem's new position on left
            if (data[current - 1] < newItem) {
                data[current] = data[current - 1]; // slide value right
                current--;
                seeking = (current > 0);
            }
            else seeking = false;
        } // while
        // Postcondition: newItem belongs in data[current]
        data[current] = newItem;
    } // newest for
} // insert

63 The Bubble Sort Java Code
void bubble (int[ ] data) {
    // Uses bubble sort to order an array of integers.
    int last, current, temp;
    for (last = data.length-1; last > 0; last--) {
        for (current = 0; current < last; current++) {
            if ( data[current] < data[current + 1] ) {
                temp = data[current];
                data[current] = data[current + 1];
                data[current + 1] = temp;
            } // if
        } // current for
        // Postcondition: Components last through the end of
        // the array are ordered.
    } // last for
} // bubble

64 Experimental Comparison How do the three methods do with the array having this content? Selection sort: 36 comparisons, 7 swaps Insertion sort: 25 comparisons, 19 swaps Bubble sort: 36 comparisons, 19 swaps

65 The Netherlands' Flag

66 Dijkstra’s Dutch National Flag Suppose that an array of length N holds three different values: red, white, and blue. Write a program that puts all the red values at the left end of the array, the blue ones at the right end, and the white values in the middle RRWBRWBRWBRWB

67 Dijkstra’s Dutch National Flag Suppose that an array of length N holds three different values: red, white, and blue. Write a program that puts all the red values at the left end of the array, the blue ones at the right end, and the white values in the middle RRRRRWWWWBBBB

68 There’s more than one way to skin a cat The trivial solution: travel through the array, figure out how much space you need for each group, set an index to the beginning of the appropriate region, then travel through the array, moving contents to the correct space (updating each region’s indexes) That’s not allowed. Try solving it with one pass through the array. RRWBRWBRWBRWB red region will start here white region will start here blue region will start here

69 Why we’re doing this The solution provides a good example of using a bunch of our previous techniques: –subscripts into an array as indexes –case analysis to guide the decision- making in the algorithm –using data bounds to decide when to leave a loop –also uses some ideas from our “sorting/searching” programs

70 A Simpler Problem Let’s say we have only two colors, red and white. We can sort the array into two regions (red at left, white at right), with the unknown region in the middle Two variables act as subscript pointers, separating the known from the unknown part of the array RRR?????WWWRR start of the unknown— first last —end of the unknown

71 The Case Analysis We start with the whole array unknown ????????????? start of the unknown— first last —end of the unknown While the first element is red, advance first RR????????RR start of the unknown— first last —end of the unknown W

72 The Case Analysis (II) While the last element is white, decrease last RR????WWWRR start of the unknown— First Last —end of the unknown If first is W and last is R, then switch them RR W ???? R WWWRR start of the unknown— First Last —end of the unknown WR

73 We Use a Data Bound The loop is over when the unknown portion is empty. RRRRRWWWWRR start of the unknown— first last —end of the unknown RW

74 The Pseudocode for the two-color flag problem
initialize first and last to array’s beginning and end;
do {
    if the current component flag(first) is red
        advance the first pointer
    else if the current component flag(last) is white
        decrease the last pointer
    else {
        swap flag(first) with flag(last)
        advance first;
        decrease last;
    } // end the last else part
} while ( first and last haven’t passed each other )
// Postcondition: there are no more components to check.

75 Didn’t I say there’s more than one way to skin a cat? We don’t have to put the white components at the end, we can put them in the middle the mystery component RR??????WWWRR redBorder whiteBorder The rule: if the next unknown component is white, advance the white border; if it is red, swap it with the leftmost white component and advance both borders.

76 So…the sliding effect is accomplished with a simple switch If the mystery component was white: RRW?????WWWRR redBorder whiteBorder If the mystery component was red: RRW????RWWRR redBorder whiteBorder the mystery component: RR??????WWWRR redBorder whiteBorder W

77 Where do redBorder and whiteBorder really point? redBorder points one component behind the first known white value; if there are no known whites, it points to “-1” (before the array) whiteBorder points one component ahead of the last known white value; if there are no known whites, it points one ahead of the last red component Our transitions maintain these relationships

78 The specification (algorithm 2, two-color flag problem) Goal: Separate an array’s red and white components Bound: Reaching the last array component Plan: If the current whiteBorder component is white, advance the whiteBorder If the current component is red, advance the redBorder; swap the value at redBorder for the value at whiteBorder; advance the whiteBorder

79 Now we’ve got a count bound
initialize redBorder to -1;
initialize whiteBorder to 0;
do {
    switch (the current whiteBorder component) {
        case white: increment whiteBorder;
                    break;
        case red:   increment redBorder;
                    swap flag[redBorder] with flag[whiteBorder];
                    increment whiteBorder;
                    break;
    } // end switch
} while whiteBorder hasn’t gone past the end of the array

80 Putting it all together In the second algorithm, the right hand side of the array is free So we can use that right-hand side to hold the blue components in the “three-color flag” problem RR????BBWWWRR redBorder whiteBorder blueBorder

81 Now we have a data bound (again)
initialize redBorder to -1; // before the first known white
initialize whiteBorder to 0; // the first remaining unknown
initialize blueBorder to array limit plus 1; // after last unknown
do {
    switch (the current whiteBorder component) {
        case white: increment whiteBorder;
                    break;
        case red:   increment redBorder;
                    swap flag[redBorder] with flag[whiteBorder];
                    increment whiteBorder;
                    break;
        case blue:  decrement blueBorder;
                    swap flag[blueBorder] with flag[whiteBorder];
                    break;
    } // end switch
} while (whiteBorder hasn't reached blueBorder);
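The three-border pseudocode translates almost line for line into Java. The version below is a sketch rather than the course's own code: it assumes the colors are stored as the characters 'R', 'W', and 'B', uses a `while` loop so that an empty array is handled, and the names `DutchFlag`, `sort`, and `swap` are hypothetical.

```java
// Hypothetical one-pass solution sketch, not from the slides.
final class DutchFlag {
    // Rearranges flag in place: reds first, then whites, then blues.
    static void sort(char[] flag) {
        int redBorder = -1;            // index of the last known red
        int whiteBorder = 0;           // index of the first unknown component
        int blueBorder = flag.length;  // index of the first known blue
        while (whiteBorder < blueBorder) {   // the unknown area is not empty
            switch (flag[whiteBorder]) {
                case 'W':
                    whiteBorder++;
                    break;
                case 'R':
                    redBorder++;
                    swap(flag, redBorder, whiteBorder);
                    whiteBorder++;
                    break;
                case 'B':
                    blueBorder--;
                    // The value swapped in is still unknown, so whiteBorder stays put.
                    swap(flag, blueBorder, whiteBorder);
                    break;
                default:
                    throw new IllegalArgumentException("unexpected color: " + flag[whiteBorder]);
            }
        }
    }

    static void swap(char[] a, int i, int j) {
        char temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    public static void main(String[] args) {
        char[] flag = "RRWBRWBRWBRWB".toCharArray();
        sort(flag);
        System.out.println(new String(flag));
    }
}
```

Each iteration either advances whiteBorder or lowers blueBorder, so the unknown area shrinks by one component per step and the loop makes a single pass.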

82 Complexity and Performance Some algorithms are better than others for solving the same problem We can’t just measure run-time, because the number will vary depending on –what language was used to implement the algorithm, how well the program was written –how fast the computer is –how good the compiler is –how fast the hard disk was…

83 Basic idea: counting operations Each algorithm performs a sequence of basic operations: –Arithmetic: (low + high)/2 –Comparison: if ( x > 0 ) … –Assignment: temp = x –Looping: while ( true ) { … } –… Idea: count the number of basic operations performed on the input

84 It Depends Difficulties: –Which operations are basic? –Not all operations take the same amount of time –Operations take different times with different hardware or compilers

85 Sample running times of basic Java operations (ran them in a loop…)
Operation             Loop body
Loop overhead         ;
Double division       d = 1.0 / d;
Method call           o.m();
Object construction   o = new SimpleObject();
Times are reported in nSec/iteration for two systems. Sys1: PII, 333MHz, jdk1.1.8, -nojit. Sys2: PIII, 500MHz, jdk1.3.1.

86 So instead… We use mathematical functions that estimate or bound: –the growth rate of a problem’s difficulty, or –the performance of an algorithm Our Motivation: analyze the running time of an algorithm as a function of only simple parameters of the input

87 Asymptotic Running Factors Operation counts are only problematic in terms of constant factors. The general form of the function describing the running time is invariant over hardware, languages or compilers!
public static int myMethod(int n) {
    int sq = 0;
    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
            sq++;
    return sq;
}
Running time is “about” $n^2$ We use “Big-O” notation, and say that the running time is $O(n^2)$

88 The Problem’s Size The problem’s size is stated in terms of n, which might be: –the number of components in an array –the number of items in a file –the number of pixels on a screen –the amount of output the program is expected to produce

89 Example Linear growth in complexity (searching an array, one component after another, to find an element). [Plot: time to perform the search versus n, the number of components]

90 Another example Polynomial growth: the quadratic growth of the problem of visiting each pixel on the screen, where n is the length of a side of the screen. [Plot: time to visit all pixels versus n, the length of the side of the screen; the curve is the polynomial $n^2$]

91 Does this matter? Yes!!! Even though computers get faster at an alarming rate, the time complexity of an algorithm still has a great effect on what can be solved Consider 5 algorithms, with time complexity –$\log_2 N$ –$N$ –$N \log_2 N$ –$N^2$ –$2^N$

92 Asymptotic Behavior of (Some) Functions

93 Some Numbers Consider 5 algorithms, with time complexity –$n$ –$n \log_2 n$ –$n^2$ –$n^3$ –$2^n$

94 Limits on problem size as determined by growth rate
Table: maximum problem size solvable in 1 sec, 1 min, and 1 hour by algorithms A1 (time complexity $n$), A2 ($n \log_2 n$), A3 ($n^2$), A4 ($n^3$), and A5 ($2^n$), assuming one unit of time equals one millisecond.
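The limits in such a table can be recomputed from the stated assumption that one unit of work takes one millisecond. The sketch below is an illustration, not the source of the slide's figures: `ProblemSizeLimits` and `maxSize` are hypothetical names, and the method assumes the cost function is increasing in n.

```java
import java.util.function.LongToDoubleFunction;

// Hypothetical helper class, not from the slides.
final class ProblemSizeLimits {
    // Largest n >= 1 with cost(n) <= budget, assuming cost is increasing; 0 if none.
    static long maxSize(LongToDoubleFunction cost, double budget) {
        if (cost.applyAsDouble(1) > budget) return 0;
        long lo = 1, hi = 2;
        while (cost.applyAsDouble(hi) <= budget) {   // double hi until it is too large
            lo = hi;
            hi *= 2;
        }
        while (hi - lo > 1) {                        // bisect between lo (fits) and hi (too large)
            long mid = lo + (hi - lo) / 2;
            if (cost.applyAsDouble(mid) <= budget) lo = mid;
            else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] budgets = {1_000, 60_000, 3_600_000};   // 1 sec, 1 min, 1 hour in ms
        for (double budget : budgets) {
            System.out.println(budget + " ms:"
                    + " n=" + maxSize(n -> n, budget)
                    + " nlog2n=" + maxSize(n -> n * (Math.log(n) / Math.log(2)), budget)
                    + " n^2=" + maxSize(n -> (double) n * n, budget)
                    + " n^3=" + maxSize(n -> (double) n * n * n, budget)
                    + " 2^n=" + maxSize(n -> Math.pow(2, n), budget));
        }
    }
}
```

Multiplying the time budget by 60 multiplies the limit for the linear algorithm by 60, but adds only a handful of items to the limit for the exponential one.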

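95 Effect of tenfold speed-up
Algorithm   Time complexity   Maximum problem size    Maximum problem size
                              before speed-up         after speed-up
A1          $n$               $s_1$                   $10 s_1$
A2          $n \log_2 n$      $s_2$                   approx. $10 s_2$ (for large $s_2$)
A3          $n^2$             $s_3$                   $\sqrt{10}\, s_3 \approx 3.16 s_3$
A4          $n^3$             $s_4$                   $\sqrt[3]{10}\, s_4 \approx 2.15 s_4$
A5          $2^n$             $s_5$                   $s_5 + \log_2 10 \approx s_5 + 3.3$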

96 Functions as Approximations
form                  name         meaning for very big n
Task(n) = Ω(f(n))     ‘omega’      f(n) is an underestimate or lower bound
Task(n) = ~(f(n))     ‘tilde’      f(n) is almost exactly correct
Task(n) = O(f(n))     ‘big O’      f(n) is an overestimate or upper bound
Task(n) = o(f(n))     ‘little o’   f(n) increasingly overestimates

97 Big O Notation Big O notation is the most useful for us; it says that a function f(n) serves as an upper bound on real-life performance. For algorithm A of size n (informally): The complexity of A(n) is on the order of f(n) if A(n) is less than or equal to some constant times f(n) The constant can be anything as long as the relation holds once n reaches some threshold.

98 Big O Notation A(n) is O(f(n)) as n increases without limit if there are constants C and k such that A(n) ≤ C*f(n) for every n > k This is useful because it focuses on growth rates. An algorithm with complexity n, one with complexity 10n, and one with complexity 13n + 73, all have the same growth rate. As n doubles, cost doubles. (We ignore the “73”, because we can increase 13 to 14, i.e., 14n ≥ 13n + 73 for all n ≥ 73.)
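The constants in the definition can be tried out directly for A(n) = 13n + 73 and f(n) = n. The sketch below only checks a finite range, so it illustrates the definition rather than proving it; `BigOWitness` and `holds` are hypothetical names.

```java
// Hypothetical helper class, not from the slides.
final class BigOWitness {
    // Checks 13n + 73 <= c * n for every n with k < n <= upTo.
    static boolean holds(long c, long k, long upTo) {
        for (long n = k + 1; n <= upTo; n++) {
            if (13 * n + 73 > c * n) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("C = 14, k = 72: " + holds(14, 72, 1_000_000));
        System.out.println("C = 14, k = 71: " + holds(14, 71, 1_000_000));
        System.out.println("C = 13, k = 72: " + holds(13, 72, 1_000_000));
    }
}
```

With C = 14 the inequality first holds at n = 73, where both sides equal 1022; with C = 13 it never holds, because the constant 73 is always left over.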

99 Big O Notation This is a mathematically formal way of ignoring constant factors, and looking only at the “shape” of the function A=O(f) should be considered as saying that “A is at most f, up to constant factors” We usually will have A be the running time of an algorithm and f a nicely written function. E.g. The running time of the following algorithm is $O(n^2)$
public static int myMethod(int n) {
    int sq = 0;
    for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
            sq++;
    return sq;
}

100 Asymptotic Analysis of Algorithms We usually embark on an asymptotic worst case analysis of the running time of the algorithm. Asymptotic: –Formal, exact, depends only on the algorithm –Ignores constants –Applicable mostly for large input sizes Worst Case: –Bounds on running time must hold for all inputs –Thus the analysis considers the worst-case input –Sometimes the “average” performance can be much better –Real-life inputs rarely “average” in any formal sense

101 Worst Case/Best Case Worst case performance measure of an algorithm states an upper bound Best case complexity measure of a problem states a lower bound; no algorithm can take less time

102 Multiplicative Factors Because of multiplicative factors, it’s not always clear that an algorithm with a slower growth rate is better If the real time complexities were $A_1 = 1000n$, $A_2 = 100 n \log_2 n$, $A_3 = 10n^2$, $A_4 = n^3$, and $A_5 = 2^n$, then $A_5$ is best for problems with n between 2 and 9, $A_3$ is best for problems with n between 10 and 58, $A_2$ is best for n between 59 and 1024, and $A_1$ is best for bigger n.
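These crossover points can be explored by evaluating the five cost functions side by side. The sketch below is an illustration; `CheapestAlgorithm` and `best` are hypothetical names, and ties are resolved in favor of the lower-numbered algorithm.

```java
// Hypothetical helper class, not from the slides.
final class CheapestAlgorithm {
    // Returns 1..5 for whichever of A1..A5 has the smallest cost at n
    // (the lowest-numbered algorithm wins ties).
    static int best(int n) {
        double log2n = Math.log(n) / Math.log(2);
        double[] cost = {
            1000.0 * n,           // A1 = 1000n
            100.0 * n * log2n,    // A2 = 100 n log2 n
            10.0 * n * n,         // A3 = 10 n^2
            (double) n * n * n,   // A4 = n^3
            Math.pow(2, n)        // A5 = 2^n
        };
        int best = 0;
        for (int i = 1; i < cost.length; i++) {
            if (cost[i] < cost[best]) best = i;
        }
        return best + 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 30, 100, 2000}) {
            System.out.println("n = " + n + ": A" + best(n));
        }
    }
}
```

The algorithm with the worst growth rate wins for tiny inputs and the one with the best growth rate wins only once n is large enough to outweigh its constant factor of 1000.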

103 An Example: Binary Search Binary search splits the unknown portion of the array in half; the worst-case search will be $O(\log_2 n)$ Doubling n only increases the logarithm by 1; growth is very slow

104 Example: Insertion Sort (reminder) starting order: move through the array, keeping the left side ordered; when we find the 35, we have to slide the 18 over to make room: continue moving through the array, always keeping the left side ordered, and sliding values over as necessary to do so: 18 slid over

105 Example: Insertion Sort Sum from 1 to N would be: (N*(N+1))/2 So sum from 1 to N-1 is ((N-1)*N)/2 Worst case for insertion sort is thus $N^2/2 - N/2$ In other words, $O(N^2)$
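The worst-case count can be confirmed by instrumenting the sort. The sketch below follows the largest-to-smallest order used in these slides, so an array that starts in increasing order is the worst case; `InsertionSortCount` and `sortAndCount` are hypothetical names, and only key comparisons are counted.

```java
// Hypothetical instrumented sort, not from the slides.
final class InsertionSortCount {
    // Sorts data largest-to-smallest in place and returns the number of key comparisons.
    static long sortAndCount(int[] data) {
        long comparisons = 0;
        for (int newest = 1; newest < data.length; newest++) {
            int newItem = data[newest];
            int current = newest;
            while (current > 0) {
                comparisons++;
                if (data[current - 1] >= newItem) break;  // found newItem's place
                data[current] = data[current - 1];         // slide value right
                current--;
            }
            data[current] = newItem;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 9;
        int[] worst = new int[n];
        for (int i = 0; i < n; i++) worst[i] = i + 1;      // increasing order: worst case
        System.out.println("comparisons: " + sortAndCount(worst)
                + ", (N-1)*N/2 = " + (n - 1) * n / 2);
    }
}
```

Running it on an array that is already in largest-to-smallest order needs only N - 1 comparisons, the single trip through the array mentioned earlier.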

106 Asymptotic Behavior of (Some) Functions N grows much slower than N^2, so we ignore the N term.

107 Selection Sort and Bubble Sort Similar analyses tell us that both Selection Sort and Bubble Sort have time complexity of O(N^2).
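
As an illustration of such an analysis, a plain selection sort makes (N-1) + (N-2) + ... + 1 = ((N-1)*N)/2 comparisons regardless of the input order. The sketch below counts them; it is one common formulation of selection sort, not necessarily the deck's own, and the class and method names are hypothetical.

```java
class SelectionSortCount {
    // Sorts data in place (ascending) and returns the number of element comparisons made.
    static long sortAndCountComparisons(int[] data) {
        long comparisons = 0;
        for (int i = 0; i < data.length - 1; i++) {
            int smallest = i;
            // Find the smallest value in the unsorted tail data[i..].
            for (int j = i + 1; j < data.length; j++) {
                comparisons++;
                if (data[j] < data[smallest]) smallest = j;
            }
            int tmp = data[i];
            data[i] = data[smallest];
            data[smallest] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] data = {29, 10, 14, 37, 13, 8};
        long c = sortAndCountComparisons(data);
        System.out.println(java.util.Arrays.toString(data) + " after " + c + " comparisons");
    }
}
```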

108 Some Complexity Examples For each of the following examples:
1. What task does the function perform?
2. What is the time complexity of the function?
3. Write a function which performs the same task but which is an order-of-magnitude (not a constant factor) improvement in time complexity.

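109 Example 1
    public int someMethod1 (int[] a) {
        int temp = 0;
        for (int i=0; i<a.length; i++)
            for (int j=i+1; j<a.length; j++)   // inner-loop bounds assumed; the source text is garbled here
                if (Math.abs(a[j]-a[i]) > temp)
                    temp = Math.abs(a[j]-a[i]);
        return temp;
    }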

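110 Example 1
1. Finds the maximum difference between two values in the array
    public int someMethod1 (int[] a) {
        int temp = 0;
        for (int i=0; i<a.length; i++)
            for (int j=i+1; j<a.length; j++)   // inner-loop bounds assumed; the source text is garbled here
                if (Math.abs(a[j]-a[i]) > temp)
                    temp = Math.abs(a[j]-a[i]);
        return temp;
    }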

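111 Example 1
1. Finds the maximum difference between two values in the array
2. O(n^2)
    public int someMethod1 (int[] a) {
        int temp = 0;
        for (int i=0; i<a.length; i++)
            for (int j=i+1; j<a.length; j++)   // inner-loop bounds assumed; the source text is garbled here
                if (Math.abs(a[j]-a[i]) > temp)
                    temp = Math.abs(a[j]-a[i]);
        return temp;
    }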

112 Example 1
    public int someMethod1 (int[] a) {
        int temp = 0;
        for (int i=0; i<a.length; i++)
            for (int j=i+1; j<a.length; j++)   // inner-loop bounds assumed; the source text is garbled here
                if (Math.abs(a[j]-a[i]) > temp)
                    temp = Math.abs(a[j]-a[i]);
        return temp;
    }
1. Finds the maximum difference between two values in the array
2. O(n^2)
3. Find the max and min values in the array, then subtract one from the other; the problem will be solved in O(n)
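
The O(n) idea from answer 3 could be sketched as follows. The method name maxDifference and the choice to return 0 for an empty array (mirroring the initial temp = 0 in the slide's method) are assumptions, and like the original it ignores possible int overflow for extreme values.

```java
class MaxDifference {
    // One pass over the array, tracking the smallest and largest values seen so far.
    static int maxDifference(int[] a) {
        if (a.length == 0) return 0; // assumed behavior for an empty array
        int min = a[0], max = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] < min) min = a[i];
            if (a[i] > max) max = a[i];
        }
        // The largest absolute difference between any two values is max - min.
        return max - min;
    }

    public static void main(String[] args) {
        int[] sample = {3, -2, 7, 4};
        System.out.println("max difference = " + maxDifference(sample));
    }
}
```

The array is visited once, so the work grows linearly with its length instead of quadratically.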

113 Example 2: a[ ] is sorted in increasing order, b[ ] is not sorted
    public boolean someMethod2(int[] a, int[] b) {
        for (int j=0; j < b.length; j++)
            for (int i=0; i < a.length-1; i++)
                if (b[j] == a[i] + a[i+1])
                    return true;
        return false;
    }

114 Example 2: a[ ] is sorted in increasing order, b[ ] is not sorted
1. Checks whether a value in b[] equals the sum of two consecutive values in a[]
    public boolean someMethod2(int[] a, int[] b) {
        for (int j=0; j < b.length; j++)
            for (int i=0; i < a.length-1; i++)
                if (b[j] == a[i] + a[i+1])
                    return true;
        return false;
    }

115 Example 2: a[ ] is sorted in increasing order, b[ ] is not sorted
1. Checks whether a value in b[] equals the sum of two consecutive values in a[]
2. O(n^2)
    public boolean someMethod2(int[] a, int[] b) {
        for (int j=0; j < b.length; j++)
            for (int i=0; i < a.length-1; i++)
                if (b[j] == a[i] + a[i+1])
                    return true;
        return false;
    }

116 Example 2: a[ ] is sorted in increasing order, b[ ] is not sorted
1. Checks whether a value in b[] equals the sum of two consecutive values in a[]
2. O(n^2)
3. For each value in b[], carry out a variation of a binary search in a[]; the problem will be solved in O(n log n)
    public boolean someMethod2(int[] a, int[] b) {
        for (int j=0; j < b.length; j++)
            for (int i=0; i < a.length-1; i++)
                if (b[j] == a[i] + a[i+1])
                    return true;
        return false;
    }
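
One possible reading of the "variation of a binary search" in answer 3: because a[] is sorted in increasing order, the consecutive sums a[i] + a[i+1] are themselves in increasing order, so the pair index i can be binary searched. The sketch below assumes that reading; fasterMethod2 and hasConsecutiveSum are hypothetical names.

```java
class ConsecutiveSumSearch {
    // True if target equals a[i] + a[i+1] for some i; a must be sorted in increasing order.
    static boolean hasConsecutiveSum(int[] a, int target) {
        int low = 0, high = a.length - 2; // valid pair indices are 0 .. a.length-2
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int sum = a[mid] + a[mid + 1];
            if (sum == target) return true;
            if (sum < target) low = mid + 1;
            else high = mid - 1;
        }
        return false;
    }

    // Same task as someMethod2: one binary search per element of b.
    static boolean fasterMethod2(int[] a, int[] b) {
        for (int j = 0; j < b.length; j++)
            if (hasConsecutiveSum(a, b[j])) return true;
        return false;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 4, 8, 10}; // consecutive sums: 4, 7, 12, 18
        System.out.println(fasterMethod2(a, new int[]{5, 12}));
        System.out.println(fasterMethod2(a, new int[]{5, 6}));
    }
}
```

With b of length m and a of length n this takes O(m log n) time, which is the slide's O(n log n) when both arrays have about n elements.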

117 Example 3: each element of a[ ] is a unique int between 1 and n; a.length is n-1 public int someMethod3 (int[] a) { boolean flag; for (int j=1; j<=a.length+1; j++) { flag = false; for (int i=0; i
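
The slide's code is cut off after the start of the inner loop. Given the stated precondition and the loop header that survives, one plausible completion is a method that looks for the single value between 1 and n missing from a[]. The version below is a guess offered for illustration, not the original code; the class name and the method name quadraticMissing are invented.

```java
class MissingNumber {
    // Hypothetical completion: for each candidate j in 1..n, scan the whole array;
    // the candidate that never appears is the missing value. This takes O(n^2) time.
    static int quadraticMissing(int[] a) {
        for (int j = 1; j <= a.length + 1; j++) {
            boolean flag = false;
            for (int i = 0; i < a.length; i++)
                if (a[i] == j) flag = true;
            if (!flag) return j;
        }
        return -1; // not reached when the precondition holds
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 4, 5}; // n = 5, and 3 is absent
        System.out.println("missing value = " + quadraticMissing(a));
    }
}
```

Under this reading, the task is "find the missing number" and the nested loops make the time complexity O(n^2).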

120 A better solution to Example 3
    public int betterMethod3 (int[] a) {
        int result = 0, sum, n;
        n = a.length+1;
        sum = (n*(n+1))/2;                  // sum of all values 1..n
        for (int i=0; i < a.length; i++)
            result += a[i];                 // sum of the values actually present
        return (sum-result);
    }

121 A better solution to Example 3 Time complexity O(n)
    public int betterMethod3 (int[] a) {
        int result = 0, sum, n;
        n = a.length+1;
        sum = (n*(n+1))/2;                  // sum of all values 1..n
        for (int i=0; i < a.length; i++)
            result += a[i];                 // sum of the values actually present
        return (sum-result);
    }

122 Theoretical Computer Science Studies the complexity of problems: –increasing the theoretical lower bound on the complexity of a problem –determining the worst-case and average-case complexity of a problem (along with best-case) –showing that a problem falls into a given complexity class (e.g., requires at least, or no more than, polynomial time)

123 Easy and Hard problems “Easy” problems, by convention, are those that can be solved in polynomial time or less. “Hard” problems have only non-polynomial solutions: exponential or worse. Showing that a problem is easy is easy (come up with an “easy” algorithm); proving a problem is hard is hard.

124 Theory and algorithms Theoretical computer scientists also –devise algorithms that take advantage of different kinds of computer hardware, like parallel processors –devise probabilistic algorithms that have very good average-case performance (though worst-case performance might be very bad) –narrow the gap between the inherent complexity of a problem and the best currently known algorithm for solving it

