
1 Searching Chapter 7

2 Objectives Introduce sequential search. – Calculate the computational complexity of a successful search. Introduce binary search. – 4 different versions. – Calculate the computational complexity. Discuss comparison trees and how they can be used to analyze algorithm performance. – Internal path length – External path length – Average path length

3 Homework Overview Written (max 40 points) – 7.2 E3 (4 pts) – 7.3 E1 (a, b, c, d) (2 pts each) – 7.4 E1 (a, b, c, d) (3 pts each) – 7.4 E2 (5 pts) – 7.4 E3 (10 pts) – 7.6 E1 (a, b, c, d) (2 pts each) – 7.6 E2 (6 pts) – 7.6 E5 (a, b, c, d, e, f, g, h) (1 pt each) – 7.6 E6 (a, b, c, d) (2 pts each) Programming (max 20 points) – 7.2 E4 (8 pts) – 7.2 P2 (12 pts) – 7.4 P1 (15 pts)

4 Searching A very common problem in computer science is trying to find a particular data entry. There are two main strategies. – Use a general storage type and then produce an algorithm to search within that type. – Design special storage types that make searching more efficient. In general we assume each entry has a key. – Name – ID number – Value – etc. We search the entries until we find the desired key.

5 Sequential Search If there is no organizational structure to the data then the only real strategy is a sequential search. – We start at one end of the list and examine each key in turn.

for (position = 0; position < size; position++) {
   the_list.retrieve(position, data);
   if (data == target) return success;
}
return not_present;

6 Complexity of Sequential Search To determine the computational cost of doing this search we count how many times some representative operation occurs. – We will choose to count the number of comparisons. – For some data types, comparisons may be very expensive. For example, comparing long strings. – How many times is the == operator used? The answer depends on where (or if) the target key is stored in the list. – We could get lucky and hit the target on the first comparison. – We could find the key on the last comparison. – If the key is not in the list we will need to look at every entry to make sure.

7 Complexity of Sequential Search Let’s assume we know the key is in the list so the search will be successful. – Let’s also assume the key has an equal probability of being in any location in the list. Let n be the number of entries in the list. We could find the desired key after 1, 2, 3, …, n comparisons, all with equal probability. The average number of comparisons is therefore (1 + 2 + 3 + ··· + n) / n = (n + 1) / 2.
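A tiny check of this arithmetic (an editorial illustration, not part of the lecture code) averages the comparison counts over every possible target position for n = 1000 and reproduces the 500.5 quoted with the sample output later:

#include <iostream>

int main()
{
   const int n = 1000;
   long long total = 0;
   for (int pos = 1; pos <= n; pos++)
      total += pos;                                 // a target at position pos costs pos comparisons
   std::cout << "average = " << static_cast<double>(total) / n    // prints 500.5
             << ", (n + 1)/2 = " << (n + 1) / 2.0 << std::endl;
   return 0;
}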

8 Key Class In order to count the number of comparisons in the run of an actual program, it is helpful to create a custom key class. This class will represent the key (any data type) but more importantly it will allow us to overload the comparison operators. The class will contain a static variable that will count the number of comparisons. Each time a comparison is made, the overloaded operator will add one to the comparison count. We can examine the comparison count at the end of the program.

9 Key Class Definition

class Key {
   int key;
public:
   static int comparisons;
   Key(int x = 0);
   int the_key() const;
};

bool operator == (const Key &x, const Key &y);
bool operator >  (const Key &x, const Key &y);
bool operator <  (const Key &x, const Key &y);
bool operator >= (const Key &x, const Key &y);
bool operator <= (const Key &x, const Key &y);
bool operator != (const Key &x, const Key &y);

int Key::comparisons = 0;

Note the static variable is assigned its initial value outside of any function. – It is accessed using the class name and the scope resolution operator.

10 Key Class Methods

The constructor and accessor methods are simple.

Key::Key(int x)
{
   key = x;
}

int Key::the_key() const
{
   return key;
}

The operators are also simple. – They use the accessor method and the default comparison to do their job. – They increment the comparison count. – They are all similar to the following.

bool operator == (const Key &x, const Key &y)
{
   Key::comparisons++;
   return x.the_key() == y.the_key();
}
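For completeness, the remaining comparison operators can be written by following exactly the same pattern; the sketch below fills in two of them (my own transcription of the pattern above, not code copied from the book):

bool operator < (const Key &x, const Key &y)
{
   Key::comparisons++;                    // count this comparison too
   return x.the_key() < y.the_key();
}

bool operator != (const Key &x, const Key &y)
{
   Key::comparisons++;
   return x.the_key() != y.the_key();
}

// operator >, operator >=, and operator <= are identical except for the
// comparison applied to the underlying int keys.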

11 Sequential Search Testing Program Now that we have a key class that can count comparisons for us, we can write a program to test sequential search. We will generate a list of odd entries from a known range of values. We will repeatedly select a random value that we know is in the list and search for it. – Computing the average number of comparisons over a large number of runs. – We will also calculate the run time for these searches. We will then repeatedly select a random entry we know is not in the list and search for it. – Computing the average number of comparisons over a large number of runs. – We will also calculate the run time for these searches.

12 Random, Timer and List To generate the random numbers we will use the Random class defined in Appendix B of the book. To calculate the run times we will use the Timer class defined in Appendix C of the book. We have used similar code before so we won’t go into the details here. Finally, we will use one of the list packages (all should work) that we developed in the last chapter. – You will need to add not_present to the enumeration of the return values.
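If the appendix code is not at hand, minimal stand-ins built on the standard library are easy to write. The sketch below is an assumption about the interfaces (it supplies only the calls used in the test program: random_integer(low, high), elapsed_time(), and reset()); it is not the book's Appendix B or C code.

#include <chrono>
#include <random>

class Random {                                // stand-in for the Appendix B class
public:
   int random_integer(int low, int high)      // uniform random integer in [low, high]
   {
      std::uniform_int_distribution<int> dist(low, high);
      return dist(engine);
   }
private:
   std::mt19937 engine{std::random_device{}()};
};

class Timer {                                 // stand-in for the Appendix C class
public:
   Timer() { reset(); }
   void reset() { start = std::chrono::steady_clock::now(); }
   double elapsed_time() const                // seconds since construction or last reset
   {
      std::chrono::duration<double> d = std::chrono::steady_clock::now() - start;
      return d.count();
   }
private:
   std::chrono::steady_clock::time_point start;
};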

13 Main Function

First, we will just be storing Keys in the list.

typedef Key Record;

The main function asks the user for details, creates the list, and then calls test_search.

int main()
{
   int items, searches;
   List the_list;
   Key::comparisons = 0;

   cout << "How many items should be stored in the list? " << flush;
   cin >> items;
   if (items < 0) {
      cout << "Error: the number of items must be nonnegative." << endl;
      exit(1);
   }

   cout << "How many searches should be performed? " << flush;
   cin >> searches;
   if (searches <= 0) {
      cout << "Error: the number of searches must be positive." << endl;
      exit(1);
   }

   for (int i = 0; i < items; i++)
      the_list.insert(i, 2 * i + 1);

   test_search(searches, the_list);
}

14 test_search Function

void test_search(int searches, List &the_list)
/* Pre:  The number of searches is a positive integer and the List the_list has
         been filled with some number of integers.
   Post: Statistics are printed about the performance of searching algorithms
         when the searched-for key is present in the list and when it is absent.
   Uses: The List class, the Random number class, the Key class, the Timer
         class, and the function sequential_search. */
{
   int list_size = the_list.size();
   if (searches <= 0 || list_size < 0) {
      cout << " Exiting test: " << endl
           << "   The number of searches must be positive." << endl
           << "   The number of list entries must exceed 0." << endl;
      return;
   }

   int i, target, found_at;
   Key::comparisons = 0;
   Random number;
   Timer clock;

   for (i = 0; i < searches; i++) {
      target = 2 * number.random_integer(0, list_size - 1) + 1;
      if (sequential_search(the_list, target, found_at) == not_present)
         cout << "Error: Failed to find expected target " << target << endl;
   }

15 test_search Function

   print_out("Successful", clock.elapsed_time(), Key::comparisons, searches);

   Key::comparisons = 0;
   clock.reset();
   for (i = 0; i < searches; i++) {
      target = 2 * number.random_integer(0, list_size);
      if (sequential_search(the_list, target, found_at) == success)
         cout << "Error: Found unexpected target " << target
              << " at " << found_at << endl;
   }

   print_out("Unsuccessful", clock.elapsed_time(), Key::comparisons, searches);
}

16 Sequential Search Function

Error_code sequential_search(const List &the_list, const Key &target, int &position)
/* Post: If an entry in the_list has key equal to target, then return success
         and the output parameter position locates such an entry within the
         list. Otherwise return not_present and position becomes invalid. */
{
   int s = the_list.size();
   for (position = 0; position < s; position++) {
      Record data;
      the_list.retrieve(position, data);
      if (data == target) return success;
   }
   return not_present;
}

17 print_out Function

void print_out(const char *search, double time, int comparisons, int searches)
/* Pre:  search is a string describing a search.
   Post: Statistics about the search are printed out. */
{
   cout << "The search " << search << " took " << time << " seconds and "
        << comparisons << " comparisons to make " << searches << " searches." << endl;
   cout << "This results in an average search time of " << time / searches
        << " and an average number of comparisons of "
        << comparisons / searches << "." << endl;
}
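Note that comparisons / searches is integer division, which is part of why the sample output on the next slide reports 494 average comparisons rather than 49474 / 100 = 494.74. If the fractional average is wanted, a cast fixes it; the variant below is an editorial tweak, not the book's version, and differs from the code above only in the cast:

void print_out_exact(const char *search, double time, int comparisons, int searches)
{
   cout << "The search " << search << " took " << time << " seconds and "
        << comparisons << " comparisons to make " << searches << " searches." << endl;
   cout << "This results in an average search time of " << time / searches
        << " and an average number of comparisons of "
        << static_cast<double>(comparisons) / searches      // keep the fractional part
        << "." << endl;
}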

18 Sample Output

Here is a sample of the output of the testing program.

How many items should be stored in the list? 1000
How many searches should be performed? 100
The search Successful took 0.002343 seconds and 49474 comparisons to make 100 searches.
This results in an average search time of 2.343e-05 and an average number of comparisons of 494.
The search Unsuccessful took 0.004342 seconds and 100000 comparisons to make 100 searches.
This results in an average search time of 4.342e-05 and an average number of comparisons of 1000.

We expected the average number of comparisons to be 1001/2 = 500.5, which is slightly different from the actual result. The time for a successful search was on average about half the time for an unsuccessful search.

19 Computational Complexity For successful searches: – The average number of comparisons is approximately half of the number of items in the list. For both successful and unsuccessful searches: – When the number of entries (and the number of comparisons) increases by a factor of 10, the run time increases by a little less than 10 times. – Both the run time and the number of comparisons are linear functions of the number of items. – There are a lot of details here that depend on the computer, compiler, language, programmer skill, etc. – We use a shorthand notation, O(n), to say the runtime is a linear function of the list size.

               Successful Searches            Unsuccessful Searches
   n           Ave. comp.   Ave. time         Ave. comp.   Ave. time
   10                   5   3.70e-07                  10   5.50e-07
   100                 49   2.46e-06                 100   4.96e-06
   1000               494   2.34e-05                1000   4.34e-05
   10000             4942   0.00018757             10000   0.00033711
   100000           49425   0.00126353            100000   0.00294095

20 Homework – Section 7.2 (page 276) Written – E3 (written on paper) (4 pts) Programming – E4 (email code) (8 pts) – P2 (email code and written report) (12 pts)

21 Binary Search If the list is ordered (say from the smallest key to the largest key) then we can do much better than sequential search. With binary search we can divide the list in two and eliminate the half that we know does not contain the desired key. We divide the list in half. If, for example, the desired key is larger than the key at the midpoint, then since the keys are ordered we know that the desired entry (if it exists) is in the top half of the list.

22 Binary Search We have cut the size of the problem in half with one comparison! We can repeat the problem, resetting bottom and top to indicate the part of the list that still might contain the desired key.

23 Binary Search Termination There are several options when implementing the binary search algorithm. First, when do we terminate? There are two options for terminating the division: 1. Stupid condition: We have a list with one entry ( top == bottom ). With this method we might keep going after we have “found” the target key. 2. Clever condition: We have a list with one entry or we find the key ( top == bottom || data == target ). This has the penalty of an extra comparison.

24 Recursion We can also implement the binary search recursively or iteratively. This leaves us with 4 possible solutions:

   Recursive, Stupid     Recursive, Clever
   Iterative, Stupid     Iterative, Clever

Which method is fastest and has the least number of comparisons? – Let’s implement them all and test them just the way we did with the sequential search.

25 Ordered Lists To enforce the fact that our list must be ordered, we will create an extension of the list class. The new class will be called Ordered_list and will overload (replace) the insert and replace methods. – The new versions will ensure that the list is always in sorted order.

26 Ordered List Class

class Ordered_list: public List {
public:
   Ordered_list();
   /* Post: The Ordered_list is initialized to be empty. */

   Error_code insert(const Record &data);
   /* Post: If the Ordered_list is not full, the function succeeds: the Record
            data is inserted into the list following the last entry of the list
            with a strictly lesser key (or in the first position if no element
            has a lesser key).
      Else: the function fails with the diagnostic Error_code overflow. */

   Error_code insert(int position, const Record &data);
   /* Post: If the Ordered_list is not full, 0 <= position <= n, where n is the
            number of elements in the list, and the Record data can be inserted
            at position in the list without disturbing the list order, then the
            function succeeds: any entry formerly at position and all later
            entries have their position numbers increased by 1 and data is
            inserted at position of the List.
      Else: the function fails with a diagnostic Error_code. */

   Error_code replace(int position, const Record &data);
   /* Post: If the entry at position can be replaced with data without
            disturbing the list order, then the function succeeds and the entry
            is replaced.
      Else: the function fails with a diagnostic Error_code. */
};

27 Ordered List Methods

Ordered_list::Ordered_list()
/* Post: The Ordered_list is initialized to be empty. */
{
   count = 0;
}

Error_code Ordered_list::insert(const Record &data)
/* Post: If the Ordered_list is not full, the function succeeds: the Record
         data is inserted into the list following the last entry of the list
         with a strictly lesser key (or in the first position if no element
         has a lesser key).
   Else: the function fails with the diagnostic Error_code overflow. */
{
   int s = size();
   int position;
   for (position = 0; position < s; position++) {
      Record list_data;
      retrieve(position, list_data);
      if (data <= list_data) break;   // stop at the first entry whose key is not strictly lesser
   }
   return List::insert(position, data);
}

28 Ordered List Methods

Error_code Ordered_list::insert(int position, const Record &data)
/* Post: If the Ordered_list is not full, 0 <= position <= n, where n is the
         number of elements in the list, and the Record data can be inserted at
         position in the list without disturbing the list order, then the
         function succeeds: any entry formerly at position and all later
         entries have their position numbers increased by 1 and data is
         inserted at position of the List.
   Else: the function fails with a diagnostic Error_code. */
{
   Record list_data;
   if (position > 0) {
      retrieve(position - 1, list_data);
      if (data < list_data) return fail;
   }
   if (position < size()) {
      retrieve(position, list_data);
      if (data > list_data) return fail;
   }
   return List::insert(position, data);
}

29 Ordered List Methods

Error_code Ordered_list::replace(int position, const Record &data)
/* Post: If the entry at position can be replaced with data without disturbing
         the list order, then the function succeeds and the entry is replaced.
   Else: the function fails with a diagnostic Error_code. */
{
   Record list_data;
   if (position > 0) {
      retrieve(position - 1, list_data);
      if (data < list_data) return fail;
   }
   if (position + 1 < size()) {        // compare with the following entry only if one exists
      retrieve(position + 1, list_data);
      if (data > list_data) return fail;
   }
   return List::replace(position, data);
}

30 Binary Search Algorithm Binary search is famous for being coded incorrectly – be careful. We need to carefully define our variables: – top and bottom will be indices enclosing the part of the list in which we are searching for the target key. At each step we will reduce the region between top and bottom by about half. The following is our loop invariant: – The target key, provided it is present in the list, will be found between the indices bottom and top, inclusive. We will start with the following values: – bottom = 0 – top = the_list.size() - 1

31 Binary Search Algorithm To actually do the searching we calculate the midpoint in the list mid=(bottom + top)/2 We will compare the target key to the key at position mid. – If the target key is greater than the key at position mid then the target can only lie in the top half of the list. bottom = mid + 1. – If the target key is less than or equal to the key at position mid then the target can only lie in the bottom half of the list. top = mid. This process repeats until top <= bottom. – Alternatively we could also terminate when the target key == the key at position mid. The process can be either iterative or recursive.
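One well-known pitfall worth flagging here (an editorial note, not from the slides): for very large lists the sum bottom + top can overflow a signed int. A drop-in replacement for the midpoint computation avoids this:

   int mid = bottom + (top - bottom) / 2;   // same midpoint, but the sum can no longer overflow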

32 Stupid Recursive Algorithm Simplification Step: If the target key > the key at position mid then repeat the problem with bottom = mid + 1 Otherwise repeat with top = mid Base Case: if top <= bottom then the list has at most one entry. Check this entry to see if it is the target.

33 Stupid Recursive Version

Error_code recursive_binary_1(const Ordered_list &the_list, const Key &target,
                              int bottom, int top, int &position)
/* Pre:  The indices bottom to top define the range to search for the target.
   Post: If a Record in the range from bottom to top in the_list has key equal
         to target, then position locates one such entry and success is
         returned. Otherwise, not_present is returned and position becomes
         undefined. */
{
   Record data;
   if (bottom < top) {                    // List has more than one entry.
      int mid = (bottom + top) / 2;
      the_list.retrieve(mid, data);
      if (data < target)                  // Reduce to top half of the list.
         return recursive_binary_1(the_list, target, mid + 1, top, position);
      else                                // Reduce to bottom half of the list.
         return recursive_binary_1(the_list, target, bottom, mid, position);
   }
   else if (top < bottom)
      return not_present;                 // List is empty.
   else {                                 // List has exactly one entry.
      position = bottom;
      the_list.retrieve(bottom, data);
      if (data == target) return success;
      else return not_present;
   }
}

34 Stupid Recursive Version

So that a user of this algorithm can call it like any other searching algorithm, we introduce a simple function to arrange the parameters into the correct format for the recursion.

Error_code run_recursive_binary_1(const Ordered_list &the_list,
                                  const Key &target, int &position)
/* Post: If a Record in the_list has key equal to target, then position locates
         one such entry and a code of success is returned. Otherwise, the
         Error_code of not_present is returned and position becomes undefined.
   Uses: recursive_binary_1 and methods of the classes Ordered_list and Record. */
{
   return recursive_binary_1(the_list, target, 0, the_list.size() - 1, position);
}

35 Stupid Iterative Algorithm Since the recursion is tail recursion, it is fairly simple to write an iterative version of the same algorithm. In this case we do not need a special function just to set up the correct parameters.

36 Stupid Iterative Version

Error_code binary_search_1(const Ordered_list &the_list, const Key &target, int &position)
/* Post: If a Record in the_list has key equal to target, then position locates
         one such entry and a code of success is returned. Otherwise, the
         Error_code of not_present is returned and position becomes undefined.
   Uses: Methods of the classes Ordered_list and Record. */
{
   Record data;
   int bottom = 0, top = the_list.size() - 1;
   while (bottom < top) {
      int mid = (bottom + top) / 2;
      the_list.retrieve(mid, data);
      if (data < target)
         bottom = mid + 1;
      else
         top = mid;
   }
   if (top < bottom)
      return not_present;
   else {
      position = bottom;
      the_list.retrieve(bottom, data);
      if (data == target) return success;
      else return not_present;
   }
}

37 Clever Versions If we check to see if the target key == the key at position mid then we might get lucky and get to quit early. The modifications to the code are fairly simple. Will the possibility of quitting early be worth the extra comparison at each step? – We will run an experiment to see.

38 Clever Recursive Version

Error_code recursive_binary_2(const Ordered_list &the_list, const Key &target,
                              int bottom, int top, int &position)
/* Pre:  The indices bottom to top define the range in the list to search for
         the target.
   Post: If a Record in the range of locations from bottom to top in the_list
         has key equal to target, then position locates one such entry and a
         code of success is returned. Otherwise, the Error_code of not_present
         is returned and position becomes undefined.
   Uses: recursive_binary_2 and methods of the classes Ordered_list and Record. */
{
   Record data;
   if (bottom <= top) {
      int mid = (bottom + top) / 2;
      the_list.retrieve(mid, data);
      if (data == target) {
         position = mid;
         return success;
      }
      else if (data < target)            // Reduce to top half of the list.
         return recursive_binary_2(the_list, target, mid + 1, top, position);
      else                               // Reduce to bottom half of the list.
         return recursive_binary_2(the_list, target, bottom, mid - 1, position);
   }
   else
      return not_present;                // List is empty.
}

39 Clever Iterative Version

Error_code binary_search_2(const Ordered_list &the_list, const Key &target, int &position)
/* Post: If a Record in the_list has key equal to target, then position locates
         one such entry and a code of success is returned. Otherwise, the
         Error_code of not_present is returned and position becomes undefined.
   Uses: Methods of the classes Ordered_list and Record. */
{
   Record data;
   int bottom = 0, top = the_list.size() - 1;
   while (bottom <= top) {
      position = (bottom + top) / 2;
      the_list.retrieve(position, data);
      if (data == target) return success;
      if (data < target)
         bottom = position + 1;
      else
         top = position - 1;
   }
   return not_present;
}

40 Modify Main The main function we used to test the sequential search can be modified in a fairly obvious manner to test these 4 different versions of binary search.
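A minimal sketch of such a test driver is shown below. It is my own arrangement, not the book's code, and it assumes a wrapper run_recursive_binary_2 exists for recursive_binary_2, analogous to run_recursive_binary_1; it also assumes the Random, Timer, Key, and print_out pieces from the sequential-search test are available.

typedef Error_code (*Search_fn)(const Ordered_list &, const Key &, int &);

struct Named_search {
   const char *name;
   Search_fn search;
};

void test_binary_searches(int searches, Ordered_list &the_list)
{
   Named_search methods[] = {
      {"Stupid, recursive", run_recursive_binary_1},
      {"Stupid, iterative", binary_search_1},
      {"Clever, recursive", run_recursive_binary_2},   // hypothetical wrapper, see lead-in
      {"Clever, iterative", binary_search_2},
   };
   int list_size = the_list.size();
   Random number;
   for (const Named_search &m : methods) {
      Key::comparisons = 0;
      Timer clock;
      for (int i = 0; i < searches; i++) {
         int target = 2 * number.random_integer(0, list_size - 1) + 1;   // key known to be present
         int found_at;
         if (m.search(the_list, target, found_at) == not_present)
            cout << "Error: " << m.name << " failed to find " << target << endl;
      }
      print_out(m.name, clock.elapsed_time(), Key::comparisons, searches);
   }
}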

41 Comparison of Methods First, let’s look at successful searches. The clever versions are faster for short lists, but as the lists get longer, eventually the stupid versions win. – If comparisons were more expensive, the stupid version would be the clear winner. Iterative versions are generally a little faster than the recursive versions. All of these are much faster than sequential search for long lists.

            Sequential          Stupid, recursive   Stupid, iterative   Clever, recursive   Clever, iterative
   n        Comp   Time         Comp   Time         Comp   Time         Comp   Time         Comp   Time
   10          5   3.70e-07        4   4e-07           4   3.5e-07         4   3e-07           4   2.3e-07
   100        49   2.46e-06        7   1.08e-06        7   9.8e-07        10   7.8e-07        10   6.9e-07
   1000      494   2.34e-05       10   5.17e-06       10   4.29e-06       17   4.36e-06       17   4.15e-06
   10000    4942   0.00018757     14   4.772e-05      14   2.928e-05      23   2.687e-05      23   2.643e-05
   100000  49425   0.00126353     17   0.0003288      17   0.00024821     30   0.00025079     30   0.00025482

42 Comparison of Methods Next, let’s look at unsuccessful searches. Unlike sequential search, the run times are similar between successful and unsuccessful searches. The clever version makes an even larger number of comparisons, while the stupid version remains the same. Otherwise the results are similar.

            Sequential           Stupid, recursive   Stupid, iterative   Clever, recursive   Clever, iterative
   n        Comp     Time        Comp   Time         Comp   Time         Comp   Time         Comp   Time
   10           10   5.50e-07       4   3.1e-07         4   2.7e-07         7   2.9e-07         7   3.8e-07
   100         100   4.96e-06       7   1.1e-06         7   9.7e-07        13   8.2e-07        13   1e-06
   1000       1000   4.34e-05      10   4.39e-06       10   4.28e-06       19   4.41e-06       20   5.89e-06
   10000     10000   0.00033711    14   3.107e-05      14   2.613e-05      26   2.68e-05       26   2.701e-05
   100000   100000   0.00294095    17   0.0003201      17   0.0003494      33   0.00026173     33   0.00025522

43 Conclusions It seems that the clever version is not worth the trouble, particularly if comparisons are expensive. – For example comparing strings. Similarly, the recursive version has slightly worse performance and the iterative version may be easier to understand. Winner – the stupid iterative version!

44 Computational Complexity Notice the relationship between the number of comparisons and the logarithm of the size of the list. We say that binary search is O(log n).

   n        Comp.   log2 n
   10           4     3.32
   100          7     6.64
   1000        10     9.97
   10000       14    13.29
   100000      17    16.61
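The log2 n column is simply the base-2 logarithm of the list size; a quick editorial check (not part of the lecture code) reproduces it:

#include <cmath>
#include <iostream>

int main()
{
   for (int n = 10; n <= 100000; n *= 10)
      std::cout << n << "  " << std::log2(n) << std::endl;   // 3.32, 6.64, 9.97, 13.3, 16.6
   return 0;
}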

45 Homework – Section 7.3 (page 285) Written – E1 (a, b, c, d) (written on paper) (2 pts each)

46 Comparison trees Our analysis of binary search algorithms depended on: – A particular implementation – A particular computer – A particular operating system – A particular language – A particular compiler It would be nice to have a general analysis that would avoid all these issues. One method is to construct a comparison tree (decision tree or search tree). This tree represents each comparison (or decision) in the algorithm.

47 Sequential Search Comparison Tree Suppose we are searching the list 1, 2, 3, …, n using sequential search. The following is the comparison tree. – Each circle represents a comparison. – Each square represents a possible result of the search. – The F result means the target key was not in the list. First the target is compared to entry 1. – If they are equal we have found the target and we are done. – If they are not equal then move on to entry 2. – etc.

48 Stupid Binary Search Comparison Tree Suppose we want to search the list 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 using the stupid version of binary search. This is the search tree. The height of the tree represents the number of comparisons in the worst case. – In this case there might be 5 comparisons, but many cases need only 4.

49 Root, Leaf and Path Length The initial comparison is called the root of the tree and will be made in all cases. Each ultimate result is a leaf of the tree. The path length is the number of interior vertices (circles) between the root and the leaf. The path length for a particular target key corresponds to the number of comparisons needed for the search. For example if the target key is 7, the path involves the following comparisons. – Compare to 5 (greater than) – Compare to 8 (less than or equal to) – Compare to 7 (less than or equal to) – Compare to 6 (greater than) – Compare to 7 (equal) – The total number of comparisons is 5

50 Stupid Average Path Length We want to know the average number of comparisons. For the searches using the stupid version: – 12 paths of length 4 – 8 paths of length 5 All of these paths begin at the root and end at a leaf and are called external paths. Adding up the lengths of all the external paths produces the external path length of the tree: here 12(4) + 8(5) = 88. The average number of comparisons in a search is therefore 88/20 = 4.4.

51 Clever Binary Search Comparison Tree Suppose we want to search the list 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 using the clever version of binary search. This is the search tree. In this case there might be anywhere between 1 and 8 comparisons.

52 Clever Binary Search Comparison Tree The tree for the clever method is somewhat complicated. We can simplify it by combining pairs of comparisons into a single circle. Here a circle represents: – One comparison if the target key is found. – Two comparisons if the target is not found.

53 Clever Path Length In this clever implementation, if the target key is 7, the path involves the following comparisons. – 5 (not equal) – 5 (greater than) – 8 (not equal) – 8 (less than) – 6 (not equal) – 6 (greater than) – 7 (equal) – There are a total of 7 comparisons.

54 Clever Average Successful Path Length Now a search could terminate at any vertex. – Successful searches end at interior vertices – Unsuccessful searches end at leaves. To calculate the average number of comparisons for a successful search we need the length of all the paths from the root to an interior vertex. – This is the interior path length of the tree.

55 Clever Average Successful Path Length There are 10 interior paths with lengths: – 1 path of length 0 (one comparison) – 2 paths of length 1 (three comparisons) – 4 paths of length 2 (five comparisons) – 3 paths of length 3 (seven comparisons) The total interior path length is 1(0) + 2(1) + 4(2) + 3(3) = 19. Each vertex along a path represents 2 comparisons, plus one more comparison at the terminating node. The average number of comparisons in a successful search is therefore (2(19) + 10)/10 = 48/10 = 4.8.

56 Clever Average Unsuccessful Path Length There are 11 unsuccessful searches using the clever version. They all end in leaves so we will calculate the external path length. There are: – 5 paths of length 3 – 6 paths of length 4 The total external path length is 5(3) + 6(4) = 39. Every vertex in these paths represents 2 comparisons, so the average number of comparisons is 2(39)/11 ≈ 7.1. The big penalty for the clever version comes in the unsuccessful case.

57 Homework – Section 7.4 (page 296) Exercises 7.4 (page 296) – E1(a, b, c, d) (written on paper) (3 pts each) – E2 (written on paper) (5 pts) Programming – P1 (email code, written report) (15 pts)

58 Extending to Larger Trees We want to extend our results to larger cases without going through the pain of actually drawing the decision trees. A 2-tree is a tree where every vertex except the leaves has exactly two children. This means we can predict the maximum possible number of vertices at each level.

   Level   Max. # of vertices
   0       1
   1       2
   2       4
   3       8
   …       …
   t       2^t

59 Extending to Larger Trees This means that if we know we have k vertices on level t then k <= 2^t, or equivalently t >= lg k.

60 Extending to Larger Trees We often want to round our results and there are two possibilities. – The floor of x (written ⌊x⌋) is the largest integer less than or equal to x. (round down) – The ceiling of x (written ⌈x⌉) is the smallest integer greater than or equal to x. (round up) Notice that ⌊x⌋ <= x <= ⌈x⌉ <= ⌊x⌋ + 1.

61 Analysis of Stupid Method Suppose we are searching a list of n items. There are n successful outcomes and the last step is to check for equality with two possible outcomes. Therefore, there are 2n leaves. The number of levels in the tree must be t = ⌈lg 2n⌉. This is also the maximum number of comparisons. Notice that a search can end at level t or at level t - 1, and that t = ⌈lg 2n⌉ < lg 2n + 1 = lg n + 2 while t - 1 >= lg 2n - 1 = lg n. In all cases there will be between lg n and lg n + 2 comparisons.

62 Analysis of Clever Method – Unsuccessful Searches In the clever method all unsuccessful searches end in leaves on the last two levels. If we are searching a list of n items then there are n + 1 leaves. – Less than the smallest key. – Between each pair of adjacent keys. – More than the largest key. The height of the tree is t = ⌈lg(n + 1)⌉. Each level in the tree corresponds to two comparisons and the leaves are on either level t or level t - 1. This means the number of comparisons will be between 2(⌈lg(n + 1)⌉ - 1) and 2⌈lg(n + 1)⌉, which is approximately 2 lg n. As we have seen before this is around twice the number with the stupid method.

63 Internal vs. External Vertices To compute the average number of comparisons for successful searches we need a fact about the relationship between the path lengths of internal and external vertices of a 2-tree. – Let E be the external path length. – Let I be the internal path length. – Let q be the number of internal vertices (not leaves). It is a general fact that E = I + 2q.

64 Internal vs. External Vertices To see that E = I + 2q we need to use a proof by induction. Base case: Suppose a tree contains only the root. In this case E = I = q = 0, so the equation is true. Induction Step: We build a larger 2-tree from a simpler one. – Suppose we have a 2-tree (with values E1, I1 and q1) where E1 = I1 + 2q1. – Pick a leaf v with path length k from the root. – Add two children to v so that it is no longer a leaf. – This produces a new 2-tree (with values E2, I2 and q2). – Notice that v is in both trees but in the new tree it is no longer a leaf, so q2 = q1 + 1. – Also the internal path length is now I2 = I1 + k. – Finally, there are two new leaves at level k + 1 but one fewer leaf at level k. – This means E2 = E1 + 2(k + 1) - k. – Now notice that E2 = E1 + 2(k + 1) - k = I1 + 2q1 + k + 2 = (I1 + k) + 2(q1 + 1) = I2 + 2q2.
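As a sanity check (an editorial addition, not part of the slides), a short program can grow random 2-trees exactly as in the induction step, one leaf expansion at a time, and confirm that E = I + 2q holds at every stage:

#include <cstdlib>
#include <iostream>
#include <vector>

int main()
{
   std::srand(42);
   std::vector<int> leaf_depths(1, 0);          // start with just the root, which is a leaf at depth 0
   long long I = 0;                              // internal path length
   long long q = 0;                              // number of internal vertices
   for (int step = 0; step < 1000; step++) {
      int pick = std::rand() % leaf_depths.size();
      int k = leaf_depths[pick];                 // chosen leaf's depth
      I += k;                                    // it becomes an internal vertex
      q += 1;
      leaf_depths[pick] = k + 1;                 // its two children become new leaves
      leaf_depths.push_back(k + 1);
      long long E = 0;                           // recompute the external path length from scratch
      for (std::size_t i = 0; i < leaf_depths.size(); i++)
         E += leaf_depths[i];
      if (E != I + 2 * q) {
         std::cout << "E = I + 2q failed at step " << step << std::endl;
         return 1;
      }
   }
   std::cout << "E = I + 2q held for every 2-tree built." << std::endl;
   return 0;
}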

65 Analysis of Clever Method – Successful Searches In the clever method the path length to each leaf is either ⌈lg(n + 1)⌉ or ⌈lg(n + 1)⌉ - 1. There are n + 1 leaves, so the external path length is approximately E ≈ (n + 1) lg(n + 1). Each internal node corresponds to a unique list key, so q = n. This means I = E - 2q ≈ (n + 1) lg(n + 1) - 2n. Recall that the total number of comparisons over all n successful searches is 2I + q. – Every node above the terminating one on each path makes 2 comparisons. – The terminating node makes 1. Thus the average number of comparisons is (2I + q)/n ≈ 2 lg(n + 1) - 3.

66 Analysis of Clever Method – Successful Searches Notice that for large n, lg(n + 1) ≈ lg n and (n + 1)/n ≈ 1. So the average number of comparisons in a successful search is approximately 2 lg n - 3. Compared with the clever method's unsuccessful case (about 2 lg n comparisons), recognizing equality early saves only about 3 comparisons, and both figures are roughly twice the lg n comparisons of the stupid method; that is the big penalty, especially for unsuccessful searches. Moral: – For short lists (<= 8) use sequential search. – For longer lists use the stupid binary search.
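That moral translates directly into code. The hybrid below is an illustrative sketch (the cutoff of 8 follows the slide, but the function itself is my own, not the book's); it relies on sequential_search and binary_search_1 as given earlier, and an Ordered_list binds to sequential_search's const List & parameter because Ordered_list derives from List.

Error_code hybrid_search(const Ordered_list &the_list, const Key &target, int &position)
{
   if (the_list.size() <= 8)
      return sequential_search(the_list, target, position);   // short list: sequential wins
   else
      return binary_search_1(the_list, target, position);     // otherwise: stupid binary search
}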

67 Homework – Section 7.4 (page 297) Written – E3 (written on paper) (10 pts)

68 Asymptotics When we are talking about the run time of an algorithm we often are only worried about what happens for large problems. We also don’t want to focus on details that would depend on a particular system. We want to compare our run times to a “library” of basic functions. – g(n) = 1 (constant) – g(n) = log n (logarithmic) – g(n) = n (linear) – g(n) = n^2 (quadratic) – g(n) = n^3 (cubic) – g(n) = 2^n (exponential)

69 Asymptotics

   Test (the limit of f(n)/g(n) as n → ∞)   Conclusion
   The limit is 0.                          f(n) has a smaller order of magnitude than g(n): f(n) is growing slower than g(n).
   The limit is finite (not 0, not ∞).      f(n) has the same order of magnitude as g(n): the growth of f(n) and g(n) only differs by a multiplied constant.
   The limit is ∞.                          f(n) has a larger order of magnitude than g(n): f(n) is growing faster than g(n).

70 Big O Notation We introduce a new notation to express the different asymptotic growth rates.

   Notation   Name        Comparison   Limit of f(n)/g(n)
   o          little o    <            = 0
   O          big O       <=           >= 0, finite
   Θ          big theta   =            nonzero, finite
   Ω          big omega   >=           nonzero, could be infinite

71 Comparisons By Growth Rate A chart can make the relationships between our “library” functions clear.

   n       1    lg n    n lg n    n^2          n^3            2^n
   1       1    0       0         1            1              2
   10      1    3.32    33        100          1000           1024
   100     1    6.64    664       10,000       1,000,000      1.268x10^30
   1000    1    9.97    9970      1,000,000    10^9           1.072x10^301

72 Rules for Big O Calculations 1. Ignore multiplied constants. – so, for example, 3n^2 is O(n^2). 2. Ignore all but the fastest growing term. – so, for example, n^3 + 5n^2 + 7 is O(n^3).

73 Rules for Big O Calculations 3. The base for logarithms doesn’t matter. – so, for example, log10 n and ln n are both O(lg n), since changing the base of a logarithm only multiplies it by a constant factor.

74 Homework – Section 7.6 (page 312) Exercises 7.6 (page 312) – E1 (a, b, c, d) (written on paper) (2 pts each) – E2 (written on paper) (6 pts) – E5 (a, b, c, d, e, f, g, h) (written on paper) (1 pt each) – E6 (a, b, c, d) (written on paper) (2 pts each)

