Data Structures and Algorithms Searching Algorithms M. B. Fayek CUFE 2006.


Agenda
1. Introduction
2. Sequential Search
3. Binary Search
4. Interpolation Search
5. Indexed Search

1. Introduction
What is a Search?
"Searching is the task of finding a certain data item (record) in a large collection of such items."
- A key field that identifies the item sought is given. (For simplicity, we consider only the key field instead of the complete record.)
- If the item is found, either its location or the complete item is returned.
- If the item is not found, an indication is given, usually by returning a non-existent index such as -1.

2. Sequential Search
Sequential Search is also called Exhaustive Search because the complete collection may have to be searched.

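2. Sequential Search
[Figure: sequential search for key item 20 in the list 15 13 6 20 21 17 8 41. Starting at i = 0, each item list[i] is compared with the key item. NO: advance to the next i. YES: return i as the found location.]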

2. Sequential Search
The first implementation will be:
    for i = 0 to n-1 do
        get next item Ai
        if Ai == k return i
    endfor
    return -1

2. Sequential Search
Another pseudocode is:
    i = 0
    while i < n and Ai != k
        i <-- i+1
    if i < n return i
    else return -1
Check boundary conditions!
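The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the function name `sequential_search` and the sample list are assumptions chosen for the example, and the collection is assumed to be an in-memory list.

```python
def sequential_search(items, key):
    """Return the index of the first item equal to key, or -1 if absent."""
    for i, item in enumerate(items):
        if item == key:  # the basic operation: one comparison per item
            return i
    return -1  # non-existent index signals "not found"


data = [15, 13, 6, 20, 21, 17, 8, 41]
print(sequential_search(data, 20))  # 3
print(sequential_search(data, 99))  # -1
```

Iterating with `enumerate` avoids the boundary mistake of running the index one step past the last item.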

2. Sequential Search
How is the algorithm implemented?
The way the collection is constructed affects the way the next item Ai is retrieved.
- In a static array: Ai is the indexed item A[i].
- In a linked list: Ai is the next node, fetched by following the "next pointer" in the present node. In this case, usually the address of the node found (a pointer to the found node) is returned, or a NULL pointer to indicate that it was not found.
- In a file: Ai is the next record retrieved from the file.
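For the linked-list case, a possible sketch is shown below. The `Node` class and the function name `search_linked_list` are hypothetical, introduced only for this example; Python's `None` plays the role of the NULL pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical singly linked node: a key plus a "next pointer".
    key: int
    next: Optional["Node"] = None


def search_linked_list(head, key):
    """Return the node holding key, or None (the NULL pointer) if absent."""
    node = head
    while node is not None and node.key != key:
        node = node.next  # fetch Ai by following the next pointer
    return node


head = Node(15, Node(13, Node(6)))
found = search_linked_list(head, 13)
print(found.key if found else None)  # 13
```

Returning the node itself rather than an index mirrors the convention described above, since a linked list has no direct indexing.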

2. Sequential Search
Complexity:
The basic operation is the comparison.
For a collection of n data items there are several cases:
- Best case: item found at the first location. Number of comparisons = 1
- Worst case: item found at the last location, or item not found. Number of comparisons = n
- Average case (item present, every location equally likely): number of comparisons = (1+n)/2
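The average-case figure can be checked with a short derivation. It assumes that the item is present and is equally likely to be at any of the n locations; the symbol C_avg is introduced here only for illustration. Finding the item at location i (counting from 1) costs i comparisons, so:

```latex
C_{\text{avg}} = \frac{1}{n}\sum_{i=1}^{n} i
               = \frac{1}{n}\cdot\frac{n(n+1)}{2}
               = \frac{n+1}{2}
```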

2. Sequential Search Enhancements
Sequential Search may be enhanced using several techniques:
1. Sorting before searching (Presorting)
2. Sentinel Search
3. Probabilistic Search

2. Sequential Search Enhancements
1. Presorting
A good question to ask before searching is whether the collection is sorted.
How do we use that information? If the collection is sorted, the search is terminated as soon as the value of the indexed item exceeds that of the search item.
What is the effect? This does not change the worst case of finding the element at the last position, but it decreases the average number of comparisons for unsuccessful searches: when the item is not in the list and its logical position is somewhere before the end, the search stops there instead of scanning to the end.
A more efficient search for a sorted collection is the binary search.

2. Sequential Search Enhancements
2. Sentinel Search
The basic loop in sequential search includes 2 comparisons at each iteration:
    while ( (i < n) and (k != A[i]) )
To decrease the number of comparisons to one per iteration, a sentinel value equal to key is inserted just beyond the end of the array, i.e. at index n.
Hence the first comparison (i < n) becomes redundant. The search will always stop on finding key, either within A (if it already existed) or at the sentinel location n (if it originally did not exist).
A check on the location where key was found indicates whether it existed or not.
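A minimal sketch of the sentinel idea, assuming the collection is a Python list that may be extended by one slot temporarily. The function name `sentinel_search` is an assumption made for this example.

```python
def sentinel_search(items, key):
    """Sequential search with a sentinel; returns the index of key or -1."""
    n = len(items)
    items.append(key)        # sentinel placed just beyond the end, at index n
    i = 0
    while items[i] != key:   # only one comparison per iteration
        i += 1
    items.pop()              # remove the sentinel again
    return i if i < n else -1  # stopping at n means only the sentinel matched


data = [15, 13, 6, 20]
print(sentinel_search(data, 20))  # 3
print(sentinel_search(data, 7))   # -1
```

The loop needs no bounds check because the sentinel guarantees a match at index n at the latest.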

2. Sequential Search Enhancements
3. Probabilistic Search
The basic idea here is that popular elements of the list, which are searched for more frequently, should require fewer comparisons to find.
This is implemented by moving an element, each time it is found, one location toward the front of the array by swapping it with the element before it.
Hence, each time an element is found, the number of comparisons needed to find it next time is decremented by one (until it reaches the first location).
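The swap-on-hit scheme described above (often called the transpose heuristic) might look like this in Python. The function name `probabilistic_search` is an assumption, and the sketch returns the element's new location after the swap, which is one design choice among several.

```python
def probabilistic_search(items, key):
    """Sequential search that moves a found item one location forward.

    Returns the item's location after the swap, or -1 if not found.
    """
    for i, item in enumerate(items):
        if item == key:
            if i > 0:
                # Swap with the element before it so the next search is cheaper.
                items[i - 1], items[i] = items[i], items[i - 1]
                return i - 1
            return i  # already at the front, nothing to swap
    return -1


data = [15, 13, 6, 20]
print(probabilistic_search(data, 20))  # 2
print(data)                            # [15, 13, 20, 6]
```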

2. Sequential Search
Modifying the first sequential algorithm for the case of a sorted list gives:
    for i = 0 to n-1 do
        if Ai > k return -1   // as list is sorted, the possible location has been passed
        if Ai == k return i
    endfor
    return -1

2. Sequential Search
Modifying the second sequential algorithm for the case of a sorted list gives:
    i = 0
    while i < n and next item Ai < k
        i <-- i+1
    if i < n and Ai == k return i
    else return -1
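A hedged Python sketch of sequential search with early termination, assuming the list is sorted in ascending order. The function name and the sample data are illustrative assumptions.

```python
def sorted_sequential_search(items, key):
    """Sequential search in an ascending list with early termination."""
    for i, item in enumerate(items):
        if item > key:
            return -1  # list is sorted: the possible location has been passed
        if item == key:
            return i
    return -1


data = [6, 8, 13, 15, 17, 20, 21, 41]
print(sorted_sequential_search(data, 17))  # 4
print(sorted_sequential_search(data, 14))  # -1, stops at 15 without scanning the rest
```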

3. Binary Search
How does it work?
- Basic idea: the list, which must be sorted, is divided at each search step into 2 sublists by checking the mid item. Depending on that comparison, the range to be searched for the possible location is either the left or the right sublist (i.e. decreased to half).
Note, however, that determining the middle item in the collection is a simple task if the data collection is represented in memory by a sequential array, whereas it is not so if the collection is represented using a linked list. Hence we will assume that the collection is a sequential array.

3. Binary Search
[Figure: binary search for key item 20 in a list of n = 8 items. At each step the key item is compared with list[mid]: mid = 4 (key item < list[mid]), then mid = 2 (key item > list[mid]), then mid = 3 (key item = list[mid], found). 3 comparisons!]

3. Binary Search
For the same input and output specs as before, the algorithm is:
    low = 0; high = n-1;
    while (low <= high) do {
        mid = (low+high)/2                          // integer division
        if ( k < A[mid] ) then high = mid - 1
        else if ( k > A[mid] ) then low = mid + 1
        else return mid                             // found
    }
    return -1                                       // not found
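The algorithm above can be sketched in Python as follows, assuming an ascending sorted list. The function name `binary_search` and the sample data are assumptions for illustration. The loop runs while `low <= high`, so a range of exactly one item is still examined.

```python
def binary_search(items, key):
    """Return the index of key in the ascending list items, or -1."""
    low, high = 0, len(items) - 1
    while low <= high:              # a one-item range must still be checked
        mid = (low + high) // 2     # integer division
        if key < items[mid]:
            high = mid - 1          # continue in the left sublist
        elif key > items[mid]:
            low = mid + 1           # continue in the right sublist
        else:
            return mid              # found
    return -1                       # not found


data = [6, 8, 13, 15, 17, 20, 21, 41]
print(binary_search(data, 20))  # 5
print(binary_search(data, 6))   # 0
print(binary_search(data, 7))   # -1
```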

3. Binary Search
Complexity:
For a collection of n data items:
- In each step, the mid item is compared to k and the range of search is divided by 2.
- This is repeated until the range is zero (in the worst case).
i.e. we should ask: how many times will we divide n by 2 until the length of the sublists is zero?
→ about log2 n, which is better than n
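To make the question above concrete, the small sketch below simply counts how many times n can be halved (with integer division) before it reaches zero. The helper name `halving_steps` is hypothetical; it counts halvings only and is not a search routine.

```python
def halving_steps(n):
    """Count how many integer halvings it takes for n to reach zero."""
    steps = 0
    while n > 0:
        n //= 2
        steps += 1
    return steps


print(halving_steps(8))          # 4
print(halving_steps(1_000_000))  # 20
```

For a million items, roughly 20 halvings suffice, compared with up to 1,000,000 comparisons for sequential search.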

4. Interpolation Search
What is meant by interpolation?
Here we try to guess more precisely where the search key resides.
Instead of calculating the middle as the physical middle (low+high)/2, it is calculated in a weighted manner, according to the value of k relative to the min and max values in the list.
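The slides do not give the weighting formula. A commonly used choice, assumed here, is linear interpolation between the end values of the current range: mid = low + (k - A[low]) * (high - low) / (A[high] - A[low]). The sketch below applies it to an ascending list of integers; the function name and sample data are illustrative assumptions.

```python
def interpolation_search(items, key):
    """Search an ascending list of integers by interpolating the probe position."""
    low, high = 0, len(items) - 1
    # The key can only be present while it lies between the end values.
    while low <= high and items[low] <= key <= items[high]:
        if items[high] == items[low]:
            return low  # all values in range equal the key; avoids dividing by zero
        # Weighted "middle": position estimated from the key's relative value.
        mid = low + (key - items[low]) * (high - low) // (items[high] - items[low])
        if key < items[mid]:
            high = mid - 1
        elif key > items[mid]:
            low = mid + 1
        else:
            return mid
    return -1


data = [10, 20, 30, 40, 50, 60, 70, 80]
print(interpolation_search(data, 60))  # 5, found on the first probe
print(interpolation_search(data, 35))  # -1
```

With evenly spaced values, as in this sample, the first probe lands directly on the key.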

4. Interpolation Search
Analysis:
- Calculations for mid are more complex.
- Significant improvement in search time, especially when the values of the data items in the collection are evenly distributed.

5. Indexed Search
What is an index?
Similar to the index of a book (e.g. telephone book), items in the index point to significant items in the collection.
This implies that in this search an additional table is used, the index table, where each item in the index table points to a specific location in the original search list.

5. Indexed Search
Algorithm:
// Input: Search array A of n items + index table of d items + key item k
// Output: Location of item with search key, or a not-found indication
Step 1: Use the index table to determine the search range (i min to i max) for key inside the original search list
Step 2: Search sequentially for key in range (i min to i max) inside original search list

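5. Indexed Search
Algorithm (example): searching for key = 53
Index Table (key → Pos): 11 → 2, 38 → 5, 71 → 8, 92 → 11
Search list:
    Pos:   0  1  2  3  4  5  6  7  8  9 10 11
    Item:  4  7 11 15 17 38 53 67 71 74 83 92
Step 1: the index table gives the range starting at Pos 5 (index key 38).
Step 2: sequential search from Pos 5 finds the key at Pos = 5+1 = 6.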
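The two steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the slides' own code: the index table is assumed to be a list of (key, position) pairs in ascending key order, the search list is assumed sorted, and the name `indexed_search` and the sample data are chosen for this example.

```python
def indexed_search(items, index, key):
    """Two-step indexed search; index holds (key, position) pairs, ascending."""
    # Step 1: scan the index table to determine the range i_min..i_max.
    i_min, i_max = 0, len(items) - 1
    for index_key, pos in index:
        if index_key <= key:
            i_min = pos          # key cannot lie before this position
        else:
            i_max = pos - 1      # key cannot lie at or after this position
            break
    # Step 2: sequential search inside the range only.
    for i in range(i_min, i_max + 1):
        if items[i] == key:
            return i
    return -1


items = [4, 7, 11, 15, 17, 38, 53, 67, 71, 74, 83, 92]
index = [(11, 2), (38, 5), (71, 8), (92, 11)]
print(indexed_search(items, index, 53))  # 6
print(indexed_search(items, index, 50))  # -1
```

Keys smaller than the first index entry fall into the range before it, which is why `i_min` starts at 0.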

5. Indexed Search
Analysis: Assuming that:
- the original table is of size n
- the index is of size d
Step 1: Determine search range: average complexity O(d/2)
Step 2: Search for key in range (i min to i max) inside original search list: assuming the average range length is n/d, the average complexity is O(n/(2d))
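The two step costs can be combined to see how the index size affects the total. The sketch below assumes that both steps are sequential and that the d index entries split the list into ranges of about n/d items each; the symbol C(d) is introduced here only for illustration.

```latex
C(d) \approx \frac{d}{2} + \frac{n}{2d},
\qquad
C'(d) = \frac{1}{2} - \frac{n}{2d^{2}} = 0
\;\Longrightarrow\; d = \sqrt{n},
\qquad
C(\sqrt{n}) \approx \sqrt{n}
```

Under these assumptions an index of about √n entries balances the two steps, giving roughly √n comparisons on average.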
