Design and Analysis of Algorithms – Chapter 7. Space-Time Tradeoffs: String Matching Algorithms. Dr. Ying Lu, RAIK 283: Data Structures.


1 Design and Analysis of Algorithms – Chapter 7
Space-Time Tradeoffs: String Matching Algorithms*
Dr. Ying Lu, ylu@cse.unl.edu
RAIK 283: Data Structures & Algorithms
*slides referred to http://www.aw-bc.com/info/levitin

2 Motivation example
- How can we sort an array a[1], a[2], …, a[n] whose values are known to come from a fixed set, e.g., {a, b, c, d, e, f, …, x, y, z}?

3 Space-time tradeoffs
For many problems, some extra space really pays off:
- Extra space in tables
  - hashing
  - non-comparison-based sorting (e.g., sorting by counting the array a[1], a[2], …, a[n], whose values are known to come from a set, e.g., {a, b, c, d, e, f, …, x, y, z})
- Input enhancement
  - auxiliary tables (shift tables for pattern matching)
  - indexing schemes (e.g., B-trees)
- Tables of solutions for overlapping subproblems
  - dynamic programming
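
The "sorting by counting" idea mentioned above can be sketched as follows. This is a minimal illustration, not code from the slides: the function name `counting_sort_letters` is hypothetical, and it assumes every value is a lowercase letter a through z.

```python
from string import ascii_lowercase


def counting_sort_letters(values):
    """Sort a list of lowercase letters by counting occurrences.

    Assumption: every element of `values` is one of 'a'..'z'.
    """
    # One counter per possible value: the extra space traded for time.
    counts = {ch: 0 for ch in ascii_lowercase}
    for v in values:
        counts[v] += 1

    # Emit each letter as often as it was seen, in alphabet order.
    result = []
    for ch in ascii_lowercase:
        result.extend([ch] * counts[ch])
    return result


print(counting_sort_letters(list("tradeoff")))
```

No two elements are ever compared with each other; the running time is proportional to n plus the size of the value set.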

4 String matching
- pattern: a string of m characters to search for
- text: a (long) string of n characters to search in
- Brute force algorithm?


6 String matching
- pattern: a string of m characters to search for
- text: a (long) string of n characters to search in
- Brute force algorithm:
  1. Align the pattern at the beginning of the text.
  2. Moving from left to right, compare each character of the pattern to the corresponding character in the text until
     - all characters are found to match (successful search); or
     - a mismatch is detected.
  3. While the pattern is not found and the text is not yet exhausted, realign the pattern one position to the right and repeat step 2.
- Time efficiency: Θ(nm) in the worst case
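
The three brute-force steps can be sketched in a few lines. The function name `brute_force_match` and the convention of returning -1 for an unsuccessful search are assumptions made for this illustration.

```python
def brute_force_match(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    # Try every alignment of the pattern against the text, left to right.
    for i in range(n - m + 1):
        j = 0
        # Compare left to right until a mismatch or a full match.
        while j < m and pattern[j] == text[i + j]:
            j += 1
        if j == m:
            return i  # all m characters matched
    return -1  # text exhausted without a match


print(brute_force_match("BAOBAB", "BESS_KNEW_ABOUT_BAOBABS"))  # 16
```

In the worst case the inner loop runs up to m times for each of the n - m + 1 alignments, which is where the Θ(nm) bound comes from.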

7 Horspool’s Algorithm
- A simplified version of the Boyer-Moore algorithm that retains its key insights:
  - compare pattern characters to text from right to left
  - given a pattern, create a shift table that determines how much to shift the pattern when a mismatch occurs (input enhancement)

8 How far to shift when a mismatch occurs?
Look at the first (rightmost) character in the text that was compared. Three cases:
- The character is not in the pattern
    .....c......................  (c not in pattern)
    BAOBAB
- The character is in the pattern (but not at the rightmost position)
    .....O......................  (O occurs once in pattern)
    BAOBAB
    .....A......................  (A occurs twice in pattern)
    BAOBAB
- The rightmost characters produced a match
    .....B......................
    BAOBAB
    .....B......................
    AAOAAB
- Shift table: stores the number of characters to shift, depending on the first character compared

9 Shift table
- Indexed by text and pattern alphabet
- If a character c is not among the first m-1 characters of the pattern, then t(c) = the pattern’s length m
- Otherwise, t(c) = the distance from the rightmost c among the first m-1 characters of the pattern to its last character

10 Shift table
- Constructed by scanning the pattern before the search begins
- Indexed by text and pattern alphabet
- All entries are initialized to the length of the pattern. E.g., for BAOBAB (m = 6), every entry from A through Z starts at 6.
- For each c occurring in the first m-1 characters of the pattern, update the table entry to the distance from the rightmost c among the first m-1 characters of the pattern to its last character. For BAOBAB this gives t(A) = 1, t(B) = 2, t(O) = 3; all other entries stay 6.


12 Shift table
- We can create the shift table by processing the pattern from left to right:
  Algorithm ShiftTable(P[0…m-1]) {
      initialize all the elements of Table with m
      for j = 0 to m-2 do
          Table[P[j]] = m-1-j
      return Table
  }
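
A possible Python rendering of ShiftTable is sketched below. One deliberate difference from the pseudocode, chosen for this illustration: instead of initializing an entry for every alphabet character to m, the table is a dictionary that holds only the characters seen in the first m-1 positions, and a lookup for any other character falls back to m. The names `shift_table` and `lookup` are hypothetical.

```python
def shift_table(pattern):
    """Build Horspool's shift table for the given pattern.

    Only characters among the first m-1 pattern positions are stored;
    every other character implicitly has shift m (see lookup below).
    """
    m = len(pattern)
    table = {}
    # Left-to-right scan: a later (more rightward) occurrence overwrites
    # an earlier one, so each entry ends up as the distance from the
    # rightmost occurrence to the last pattern position.
    for j in range(m - 1):
        table[pattern[j]] = m - 1 - j
    return table


def lookup(table, c, m):
    """Shift for text character c; defaults to the pattern length m."""
    return table.get(c, m)


t = shift_table("BAOBAB")
print(t)                  # {'B': 2, 'A': 1, 'O': 3}
print(lookup(t, "K", 6))  # 6
```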

13 Example
Text: BARD LOVED BANANAS
Pattern: BAOBAB
Shift table, indexed by: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
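
The slides describe Horspool's algorithm without giving full pseudocode for the search. The following is a minimal sketch assembled from the rules on the preceding slides (right-to-left comparison, shift by the table entry of the text character aligned with the pattern's last position). The function name `horspool` and the -1 return value for "not found" are assumptions.

```python
def horspool(pattern, text):
    """Return the index of the first match of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    # Shift table: rightmost occurrence among the first m-1 characters wins.
    table = {pattern[j]: m - 1 - j for j in range(m - 1)}

    i = m - 1  # text index aligned with the pattern's last character
    while i < n:
        k = 0  # number of characters matched so far, right to left
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        # Shift by the entry for the text character under the pattern's
        # last position; characters absent from the table shift by m.
        i += table.get(text[i], m)
    return -1


print(horspool("BAOBAB", "BARD LOVED BANANAS"))  # -1
```

On this example the pattern is tried at three alignments (shifts of 6 on L, 2 on B, then 6 on N) before the text is exhausted.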

14 In-class exercises
- P267: 7.2.1. Apply Horspool’s algorithm to search for the pattern BAOBAB in the text BESS_KNEW_ABOUT_BAOBABS
- P268: 7.2.3. How many character comparisons will be made by Horspool’s algorithm in searching for each of the following patterns in the binary text of 1000 zeros?
  - 00001
  - 10000

15 In-class exercises
- P264: 7.2.1. Apply Horspool’s algorithm to search for the pattern BAOBAB in the text BESS_KNEW_ABOUT_BAOBABS
- P264: 7.2.3. How many character comparisons will be made by Horspool’s algorithm in searching for each of the following patterns in the binary text of 1000 zeros?
  - 00001 (1*996 = 996 comparisons)
  - 10000 (5*996 = 4980 comparisons)
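
The two counts can be checked by instrumenting a Horspool search with a comparison counter. This is a sketch for verification only; the function name `horspool_comparisons` and its (index, comparisons) return pair are assumptions.

```python
def horspool_comparisons(pattern, text):
    """Run Horspool's search; return (match index or -1, comparisons)."""
    m, n = len(pattern), len(text)
    table = {pattern[j]: m - 1 - j for j in range(m - 1)}
    comparisons = 0
    i = m - 1  # text index under the pattern's last character
    while i < n:
        k = 0
        while k < m:
            comparisons += 1  # one character comparison
            if pattern[m - 1 - k] != text[i - k]:
                break
            k += 1
        if k == m:
            return i - m + 1, comparisons
        i += table.get(text[i], m)
    return -1, comparisons


zeros = "0" * 1000
print(horspool_comparisons("00001", zeros))  # (-1, 996)
print(horspool_comparisons("10000", zeros))  # (-1, 4980)
```

Both patterns shift by t(0) = 1 after every mismatch, giving 1000 - 5 + 1 = 996 alignments; the first pattern fails on its first comparison, the second only on its fifth.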

16 Time Efficiency
- Horspool’s algorithm
  - Θ(nm) in the worst case
  - Θ(n) for random texts

17 In-Class Exercises
- You have an array of positive integers like ar[] = {1,3,2,4,5,4,2}. You need to create another array ar_low[] such that ar_low[i] = the number of elements lower than or equal to ar[i]. So the output for the above should be {1,4,3,6,7,6,3}.
- The algorithm’s time complexity should be O(n), and use of extra space is allowed.
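
One way to approach this exercise is with a counting table and prefix sums, in the spirit of sorting by counting. The sketch below is one possible answer, not necessarily the intended one. It assumes a non-empty list of positive integers whose largest value is at most proportional to n; otherwise the table of size max(ar) + 1 dominates and the running time is O(n + max(ar)) rather than O(n). The function name `count_lower_or_equal` is hypothetical.

```python
def count_lower_or_equal(ar):
    """For each ar[i], count the elements of ar that are <= ar[i]."""
    max_value = max(ar)
    # counts[v] = number of elements equal to v (extra space).
    counts = [0] * (max_value + 1)
    for x in ar:
        counts[x] += 1
    # Prefix sums: counts[v] becomes the number of elements <= v.
    for v in range(1, max_value + 1):
        counts[v] += counts[v - 1]
    return [counts[x] for x in ar]


print(count_lower_or_equal([1, 3, 2, 4, 5, 4, 2]))  # [1, 4, 3, 6, 7, 6, 3]
```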

18 Boyer-Moore algorithm
- Based on the same two ideas:
  - compare pattern characters to text from right to left
  - given a pattern, create a shift table that determines how much to shift the pattern when a mismatch occurs (input enhancement)
- Uses an additional shift table with the same idea applied to the number of matched characters

19 In-class exercises
- P264: 7.2.7. How many character comparisons will the Boyer-Moore algorithm make in searching for each of the following patterns in the binary text of 1000 zeros?
  - 00001
  - 10000

20 In-class exercises
- P264: 7.2.7. How many character comparisons will the Boyer-Moore algorithm make in searching for each of the following patterns in the binary text of 1000 zeros?
  - 00001 (1*996 = 996 comparisons)
  - 10000 (5*200 = 1000 comparisons)
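
The slides describe the second Boyer-Moore table only informally, so the sketch below fills in details from the common textbook formulation, and these are assumptions: after k > 0 characters match, the shift is the larger of a bad-symbol shift d1 = max(t1(c) - k, 1) and a good-suffix shift d2(k); with k = 0 the shift is t1(c). The names `good_suffix_table` and `boyer_moore_comparisons` are hypothetical, and the good-suffix table is built by brute force for clarity rather than speed.

```python
def good_suffix_table(p):
    """d2[k] for k = 1..m-1: shift after the last k characters matched."""
    m = len(p)
    d2 = [0] * m
    for k in range(1, m):
        suffix = p[m - k:]
        shift = None
        # Rightmost other occurrence of the suffix that is not preceded
        # by the same character as the suffix itself.
        for s in range(m - k - 1, -1, -1):
            if p[s:s + k] == suffix and (s == 0 or p[s - 1] != p[m - k - 1]):
                shift = (m - k) - s
                break
        if shift is None:
            # Otherwise: longest prefix of length l < k that equals the
            # suffix of length l; if there is none, shift by m.
            shift = m
            for l in range(k - 1, 0, -1):
                if p[:l] == p[m - l:]:
                    shift = m - l
                    break
        d2[k] = shift
    return d2


def boyer_moore_comparisons(pattern, text):
    """Return (match index or -1, number of character comparisons)."""
    m, n = len(pattern), len(text)
    t1 = {pattern[j]: m - 1 - j for j in range(m - 1)}  # bad-symbol table
    d2 = good_suffix_table(pattern)
    comparisons = 0
    i = m - 1  # text index under the pattern's last character
    while i < n:
        k = 0
        while k < m:
            comparisons += 1
            if pattern[m - 1 - k] != text[i - k]:
                break
            k += 1
        if k == m:
            return i - m + 1, comparisons
        d1 = max(t1.get(text[i - k], m) - k, 1)  # mismatched text character
        i += d1 if k == 0 else max(d1, d2[k])
    return -1, comparisons


zeros = "0" * 1000
print(boyer_moore_comparisons("00001", zeros))  # (-1, 996)
print(boyer_moore_comparisons("10000", zeros))  # (-1, 1000)
```

For 10000 the good-suffix shift after four matched zeros is 5, so only 200 alignments are tried, compared with the 996 that Horspool's algorithm needs.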

21 In-class exercises
- P264: 7.2.3. How many character comparisons will be made by Horspool’s algorithm in searching for each of the following patterns in the binary text of 1000 zeros?
  - 01010
- P264: 7.2.7. How many character comparisons will the Boyer-Moore algorithm make in searching for each of the following patterns in the binary text of 1000 zeros?
  - 01010

