Week 13: Searching and Sorting


1 Week 13: Searching and Sorting
CS 177

2 Searching for a number
Let’s say that I give you a list of numbers, and I ask you, “Is 37 on this list?” As a human, you have no problem answering this question, as long as the list is reasonably short. What if the list is an array, and I want you to write a Java program to find some number?

3 Search algorithm
Easy! We just look through every element in the array until we find it or run out. If we find it, we return the index; otherwise we return -1.
    public static int find( int[] array, int number ) {
        for( int i = 0; i < array.length; i++ )
            if( array[i] == number )
                return i;
        return -1;
    }

4 How long does it take?
We talked about Big Oh notation last week. Now we have some way to measure how long this algorithm takes. How long, if n is the length of the array? O(n) time, because in the worst case we have to look through every element in the array.

5 Can we do better?
Is there any way to go smaller than O(n)? What complexity classes even exist that are smaller than O(n)? O(1) and O(log n). Well, on average, we only need to check half the numbers; that’s ½ n, which is still O(n). Darn…

6 We can’t do better unless…
We can do better with more information. For example, if the list is sorted, then we can use that information somehow. How? We can play a High-Low game.

7 Binary search
Repeatedly divide the search space in half. We’re looking for 37, let’s say:
Check the middle: 23 (Too low)
Check the middle: 54 (Too high)
Check the middle: 31 (Too low)
Check the middle: 37 (Found it!)
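The halving idea can be sketched in Java as follows. This is a minimal sketch, not code from the slides: the class name `BinarySearchSketch` and the sample array are assumptions made for illustration, and the array must already be sorted in ascending order.

```java
// Hypothetical class name; the slides describe the idea but give no code for it.
class BinarySearchSketch {
    // Returns the index of number in a sorted (ascending) array, or -1 if it is absent.
    static int find(int[] sorted, int number) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // middle of the remaining search space
            if (sorted[mid] == number) {
                return mid;                   // found it
            } else if (sorted[mid] < number) {
                low = mid + 1;                // too low: keep only the right half
            } else {
                high = mid - 1;               // too high: keep only the left half
            }
        }
        return -1;                            // search space is empty
    }

    public static void main(String[] args) {
        int[] data = {7, 12, 23, 31, 37, 54, 60}; // assumed sample data
        System.out.println(find(data, 37)); // prints 4
        System.out.println(find(data, 8));  // prints -1
    }
}
```

Like the linear `find` method, this returns the index on success and -1 on failure, so the two can be swapped whenever the array is known to be sorted.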

8 So, is that faster than linear search?
How long can it take? What if you never find what you’re looking for? Well, then, you’ve narrowed it down to a single spot in the array that doesn’t have what you want. And what’s the maximum amount of time that could have taken?

9 Running time for binary search
We cut the search space in half every time. At worst, we keep cutting n in half until we get to 1. The running time is O(log n). For 64 items log₂ n = 6, for 128 items log₂ n = 7, for 256 items log₂ n = 8, for 512 items log₂ n = 9, …

10 Guessing game
We can apply this idea to a guessing game. First we tell the computer that we are going to guess a number between 1 and n. We guess, and it tries to narrow down the number. It should only take log n tries: log₂(1,000,000) is only about 20.
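To check the "about 20" figure, the sketch below counts how many guesses a halving strategy needs for every possible secret number. The class and method names (`GuessingGameSketch`, `guessesNeeded`, `worstCase`) are hypothetical, and the strategy of always guessing the middle of the remaining range is an assumption about how the game is played.

```java
// Hypothetical names throughout; a sketch of the halving strategy, not the course's program.
class GuessingGameSketch {
    // Counts the guesses needed to hit secret in [1, n] by always guessing the middle.
    static int guessesNeeded(int n, int secret) {
        int low = 1;
        int high = n;
        int guesses = 0;
        while (low <= high) {
            int guess = low + (high - low) / 2;
            guesses++;
            if (guess == secret) {
                return guesses;
            } else if (guess < secret) {
                low = guess + 1;   // guess was too low
            } else {
                high = guess - 1;  // guess was too high
            }
        }
        return guesses; // only reached if secret is outside [1, n]
    }

    // Largest number of guesses over every possible secret in [1, n].
    static int worstCase(int n) {
        int worst = 0;
        for (int secret = 1; secret <= n; secret++) {
            worst = Math.max(worst, guessesNeeded(n, secret));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstCase(1_000_000)); // prints 20
    }
}
```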

11 Interview question
This is a classic interview question asked by Microsoft, Amazon, and similar companies. Imagine that you have 9 red balls. One of them is just slightly heavier than the others, but so slightly that you can’t feel it. You have a very accurate two-pan balance you can use to compare balls. Find the heaviest ball in the smallest number of weighings.

12 What’s the smallest possible number?
It’s got to be 8 or fewer: we could easily test one ball against every other ball. There must be some cleverer way to divide them up, something that is related somehow to binary search.

13 That’s it!
Leave one ball out and divide the remaining balls in half each time. If the halves balance, it must be the one we left out to begin with.

14 Nope, we can do better
How? The key is that you can actually cut the number of balls into three parts each time. We weigh 3 against 3; if they balance, then we know the 3 left out have the heavy ball. When it’s down to 3, weigh 1 against 1, again knowing that it’s the one left out that’s heavy if they balance.

15 Thinking outside the box, er, ball
The cool thing is that we are trisecting the search space each time. This means that it takes log₃ n weighings to find the heaviest ball. We can do 9 balls in 2 weighings, 27 balls in 3 weighings, 81 balls in 4 weighings, etc.
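The trisection strategy can be simulated with a short Java sketch. Everything here is an assumption for illustration: the class name `HeavyBallSketch`, the representation of balls as an `int[]` of weights with exactly one heavier entry, and the use of a sum over a group to stand in for one pan of the balance.

```java
// Hypothetical simulation of the two-pan balance puzzle; names are illustrative only.
class HeavyBallSketch {
    // Total weight of count balls starting at index start (one pan of the balance).
    static long pan(int[] weights, int start, int count) {
        long total = 0;
        for (int i = start; i < start + count; i++) {
            total += weights[i];
        }
        return total;
    }

    // Returns the index of the single heavier ball by splitting into three groups each time.
    static int findHeavy(int[] weights) {
        int start = 0;
        int size = weights.length;
        while (size > 1) {
            int group = (size + 2) / 3;                  // ceiling of size / 3
            long left = pan(weights, start, group);
            long right = pan(weights, start + group, group);
            if (left > right) {
                size = group;                            // heavy ball is in the left pan
            } else if (right > left) {
                start += group;                          // heavy ball is in the right pan
                size = group;
            } else {
                start += 2 * group;                      // pans balance: it was left out
                size -= 2 * group;
            }
        }
        return start;
    }

    // Worst-case number of weighings for n balls: how often n can be cut to a third.
    static int weighingsNeeded(int n) {
        int count = 0;
        while (n > 1) {
            n = (n + 2) / 3;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(weighingsNeeded(9));  // prints 2
        System.out.println(weighingsNeeded(27)); // prints 3
        System.out.println(weighingsNeeded(81)); // prints 4
    }
}
```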

16 Sorting
Searching is really useful. The idea of O(log n) time makes all sorts of real world applications work: Google, for example. But we can’t do binary search unless our list is sorted. As with searching, computer scientists have devoted a lot of thought to figuring out the best way to do sorting.

17 Sorting
The importance of sorting should be evident to you by now. Applications: sorting a column in Excel; organizing your iTunes playlists by artist name; ranking a high school graduating class; finding a median score to report on an exam; countless others…

18 But, is it interesting?
Yes! It’s tricky. No, it’s not! Give me 100 names written on 100 index cards and I can sort them, no problem. One way to remind yourself that it’s tricky is by increasing the problem size. What if I gave you 1,000,000 names written on 1,000,000 index cards? You might need some organizational system.

19 Computers are stupid
A computer can’t “jump” to the M section, unless you explicitly create an M section or something. For most common sorts, the computer has to compare two numbers (or Strings or whatever) at a time. Based on that comparison, it has to take another step in the algorithm. Remember, we can swap things around in an array.

20 Bubble sort
Bubble sort is a classic sorting algorithm. It is very simple to understand. It is very simple to code. It is not very fast. The idea is simply to go through your array, swapping out-of-order elements until nothing is out of order.

21 Code for a single pass
One “pass” of the bubble sort algorithm goes through the array once, swapping out-of-order elements:
    for( int j = 0; j < array.length - 1; j++ )
        if( array[j] > array[j + 1] ) {
            int temp = array[j];
            array[j] = array[j + 1];
            array[j + 1] = temp;
        }

22 Single pass example
Run through the whole array, swapping any entries that are out of order.
[Figure: one pass over an example array containing 7, 45, 54, 37, 108, and 51, with each adjacent pair labeled “Swap” or “No swap”]

23 How many passes do we need?
How bad could it be? What if the array was in reverse-sorted order? One pass would only move the largest number to the bottom. We would need n – 1 passes to sort the whole array.
One pass over a reverse-sorted array, step by step:
7 6 5 4 3 2 1
6 7 5 4 3 2 1
6 5 7 4 3 2 1
6 5 4 7 3 2 1
6 5 4 3 7 2 1
6 5 4 3 2 7 1
6 5 4 3 2 1 7

24 Full bubble sort code
The full Java method for bubble sort would require us to have at least n – 1 passes. Alternatively, we could keep a flag to indicate that no swaps were needed on a given pass.
    for( int i = 0; i < array.length - 1; i++ )
        for( int j = 0; j < array.length - 1; j++ )
            if( array[j] > array[j + 1] ) {
                int temp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = temp;
            }
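The flag-based alternative mentioned on this slide might look like the sketch below. The class name `BubbleSortSketch` and the variable name `swapped` are assumptions; the slides describe the flag only in words.

```java
import java.util.Arrays;

// Hypothetical class name; a sketch of bubble sort that stops early using a flag.
class BubbleSortSketch {
    // Sorts array in ascending order, stopping after the first pass that makes no swaps.
    static void bubbleSort(int[] array) {
        boolean swapped = true;          // assume unsorted until a clean pass proves otherwise
        while (swapped) {
            swapped = false;
            for (int j = 0; j < array.length - 1; j++) {
                if (array[j] > array[j + 1]) {
                    int temp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = temp;
                    swapped = true;      // something was out of order on this pass
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {7, 45, 54, 37, 108, 51}; // assumed sample data
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // prints [7, 37, 45, 51, 54, 108]
    }
}
```

On an array that is already sorted, this version finishes after a single pass, while the fixed double loop always runs all n – 1 passes.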

25 Ascending sort
The bubble sort we saw sorts integers in ascending order. What if you wanted to sort them in descending order? Only a single change is needed to the inner loop, flipping the comparison from > to <:
    for( int j = 0; j < array.length - 1; j++ )
        if( array[j] < array[j + 1] ) {
            int temp = array[j];
            array[j] = array[j + 1];
            array[j + 1] = temp;
        }

26 What’s the running time of bubble sort?
The outer loop runs n – 1 times. The inner loop runs n – 1 times. The inner loop has a constant amount of work inside of it, call it c. (n – 1)(n – 1)c = cn² – 2cn + c, which is… O(n²). Hmm, not great, let’s try another sort.

27 Insertion sort
Instead of “bubbling” down the largest (or smallest) number, keep the first k elements sorted, and keep increasing k. Philosophically, not that different from bubble sorting. The nice thing is that we can stop sorting whenever the new thing we added is in place.

28 Insertion sort code
    for( int i = 1; i < array.length; i++ )
        for( int j = i; j > 0; j-- ) // count back
            if( array[j - 1] > array[j] ) {
                int temp = array[j];
                array[j] = array[j - 1];
                array[j - 1] = temp;
            }
            else
                break;
The nice thing is that each inner loop runs at most i times.

29 What’s the running time of insertion sort?
The outer loop runs n – 1 times. Well, each inner loop runs a maximum of i times, where i is the current iteration of the outer loop. 1 + 2 + 3 + … + (n – 1) = n(n – 1)/2 = ½n² – ½n, which is… O(n²).

30 Better than quadratic?
Is there a way to sort things that is better than quadratic time? Yes! Merge sort. Keep dividing your list in half, over and over, until you get down to lists with one element in each. Merge the lists together, sorting them as you do: merge lists of 1 into sorted lists of 2, merge the sorted list of 2 with another sorted list of 2, then merge lists of 4, and keep going until you have merged everything together. It takes O(n log n) time, which is the best you can do for a comparison-based sort.
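The slides give no code for merge sort, so the following is only one possible sketch of the divide-and-merge idea. The class name `MergeSortSketch` and the choice to return a sorted array (rather than sort in place, as the bubble and insertion sort code does) are assumptions made to keep the example short.

```java
import java.util.Arrays;

// Hypothetical class name; a recursive sketch of merge sort on int arrays.
class MergeSortSketch {
    // Returns the elements of array in ascending order; the input is not modified.
    static int[] mergeSort(int[] array) {
        if (array.length <= 1) {
            return array;                          // a list of 0 or 1 elements is already sorted
        }
        int mid = array.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(array, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(array, mid, array.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into one sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // Take the smaller front element; <= keeps equal elements in their original order.
            if (left[i] <= right[j]) {
                result[k++] = left[i++];
            } else {
                result[k++] = right[j++];
            }
        }
        while (i < left.length) {
            result[k++] = left[i++];               // copy whatever remains on the left
        }
        while (j < right.length) {
            result[k++] = right[j++];              // copy whatever remains on the right
        }
        return result;
    }

    public static void main(String[] args) {
        int[] data = {7, 45, 54, 37, 108, 51};     // assumed sample data
        System.out.println(Arrays.toString(mergeSort(data))); // prints [7, 37, 45, 51, 54, 108]
    }
}
```

Each level of recursion does O(n) work in `merge`, and halving the list gives about log n levels, which is where the O(n log n) bound comes from.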

31 Bucket sort paradigm
You use bucket sort when you know that your data is in a narrow range, like the numbers between 1 and 10 or even 1 and 100. As long as the range of possible values is in the neighborhood of the length of your list, bucket sort can do well. Example: 150 students with integer grades between 1 and 100. It doesn’t work for sorting doubles or Strings.

32 Bucket sort algorithm
Make an array with enough elements to hold every possible value in your range of values. If you need 1 – 100, make an array with length 100. Sweep through your original list of numbers; when you see a particular value, increment the entry at the corresponding index in the value array. To get your final sorted list, sweep through your value array and, for every entry with value k > 0, print its index k times.

33 Bucket sort example
We know our values will be in the range [1,10].
Our example array: 6 2 10 1 7
Our values array (value: count): 1: 1, 2: 1, 3: 0, 4: 0, 5: 0, 6: 1, 7: 1, 8: 0, 9: 0, 10: 1
The result: 1 2 6 7 10

34 Bucket sort in code
Here’s bucket sort in code with a range of [min, max]:
    int[] values = new int[max - min + 1];
    for( int i = 0; i < array.length; i++ )
        values[array[i] - min]++; // count each value
    int count = 0;
    for( int i = 0; i < values.length; i++ ) {
        for( int j = 0; j < values[i]; j++ ) {
            array[count] = i + min; // write the value back values[i] times
            count++;
        }
    }

35 How long does bucket sort take?
It takes O(n) time to scan through the original array. But now we have to take into account the number of values we expect. So, let’s say we have m possible values. It takes O(m) time to scan back through the value array, with O(n) additional updates to the original array. Time: O(n + m).

