
1 Quick Search Algorithm. Based on: A very fast substring search algorithm, D. M. Sunday, Communications of the ACM, 33(8), 1990, pp. 132-142. Adviser: R. C. T. Lee. Speaker: C. W. Cheng, National Chi Nan University.

2 Problem Definition Input: a text string T with length n and a pattern string P with length m. Output: all occurrences of P in T.
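
As a point of reference for the input and output just defined, here is a minimal brute-force sketch. It is not the Quick Search algorithm; the function name `naive_occurrences` is hypothetical, and reporting 1-based positions is an assumption made to match the indexing used on the later slides.

```python
def naive_occurrences(text: str, pattern: str) -> list[int]:
    """Return every 1-based position in text where pattern occurs."""
    n, m = len(text), len(pattern)
    hits = []
    # Try every alignment s of the pattern against the text.
    for s in range(n - m + 1):
        if text[s:s + m] == pattern:
            hits.append(s + 1)  # convert the 0-based index to a 1-based position
    return hits


# Strings taken from the example slides of this talk.
print(naive_occurrences("GCGCAGAGAGTAGAGAGTACG", "CAGAGAG"))
```

Any faster algorithm for this problem should return the same list of positions as this baseline.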

3 Definition
T_s : the first character of the text T that is aligned with the pattern P.
P_l : the first character of the pattern P that is aligned with the text T.
T_j : the character at the jth position of the text T.
P_i : the character at the ith position of the pattern P.
P_f : the last character of the pattern P.
n : the length of T.
m : the length of P.

4 Rule 2-2: 1-Suffix Rule (A Special Version of Rule 2) Consider the 1-suffix x. We may apply Rule 2-2 now.

5 Introduction Quick Search is a simplification of the Boyer-Moore algorithm. It uses only the bad-character shift, is easy to implement, and is based on Rule 2-2, the 1-Suffix Rule.

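6 Quick Search Rule Suppose that P_1 is aligned with T_s, and we compare the text T and the pattern P pairwise from left to right. Assume that the first mismatch occurs when comparing T_q with P_p. Since T_q ≠ P_p, the pattern must move by at least one position, so T_{s+m}, the character just past the current window, will be covered by the pattern after the move. We therefore find the largest position i such that P_i = T_{s+m} and move P to the right until P_i is aligned with T_{s+m}, which is a shift of m-i+1 positions. If T_{s+m} does not occur in P, we shift by m+1 positions. [Figure: T and P before and after the shift, showing the mismatch at T_q and P_p and the character P_i moved under T_{s+m}.]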
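
The rule can be sketched directly, without any table. Counting positions from 1, aligning P_i with T_{s+m} moves the pattern m - i + 1 positions, and a character that does not occur in P lets the pattern jump past it entirely. The function name `quick_search_shift` is hypothetical, and this is only an illustration of the rule under those assumptions.

```python
def quick_search_shift(pattern: str, next_char: str) -> int:
    """Shift for one step, where next_char is T_{s+m}, the text
    character just past the current window."""
    m = len(pattern)
    # str.rfind returns the 0-based index of the rightmost occurrence,
    # or -1 if absent, so i is the 1-based largest position (0 if absent).
    i = pattern.rfind(next_char) + 1
    return m - i + 1  # equals m + 1 when next_char is not in the pattern


for ch in "ACGT":
    print(ch, quick_search_shift("CAGAGAG", ch))
```

Scanning the pattern on every step costs O(m) per shift, which is why a precomputed table is the usual choice.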

7 Quick Search Preprocessing Table The only preprocessing needed is to construct a shift table. Let x be a character in the alphabet. If x occurs in P, we record the position of its last (rightmost) occurrence, counted from the right end of P starting at 1. If x does not occur in P, we record m+1.
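
A minimal sketch of this table construction follows. The name `build_qs_table` and the choice to pass the alphabet as a string are assumptions made for illustration, not part of the original slides.

```python
def build_qs_table(pattern: str, alphabet: str) -> dict[str, int]:
    """Map each character to its shift: the distance of its rightmost
    occurrence from the right end of the pattern (counted from 1),
    or m + 1 if the character does not occur in the pattern."""
    m = len(pattern)
    # Default entry for characters that never appear in the pattern.
    table = {ch: m + 1 for ch in alphabet}
    # Scan left to right so that later (more rightward) occurrences
    # overwrite earlier ones.
    for i, ch in enumerate(pattern):
        table[ch] = m - i
    return table


print(build_qs_table("CAGAGAG", "ACGT"))
```

The two loops touch each alphabet character and each pattern character once, which matches an O(m + σ) preprocessing cost for an alphabet of size σ.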

8 Quick Search Preprocessing Table Example: P=CAGAGAG, m=7.
Position from the right end: 7 6 5 4 3 2 1
P:                           C A G A G A G
x:       A C G T
qsBC[x]: 2 7 1 8
With this table, the number of positions to move the pattern is found by a single lookup of qsBC[T_{s+m}]. After the movement, we compare the pattern and the text from left to right. If a mismatch occurs, we shift again; otherwise we output the position of the first character in T that is aligned with the pattern P.
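
Putting the table and the left-to-right comparison together, the search loop might look like the sketch below. The function name `quick_search`, the dictionary-based table, and the 1-based output positions are assumptions chosen for illustration; this is not the code from Sunday's paper.

```python
def quick_search(text: str, pattern: str) -> list[int]:
    """Return the 1-based positions of all occurrences of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Shift table: rightmost occurrence wins; absent characters use m + 1.
    shift = {}
    for i, ch in enumerate(pattern):
        shift[ch] = m - i
    hits = []
    s = 0  # 0-based index of the text character aligned with pattern[0]
    while s <= n - m:
        # Compare from left to right until a mismatch or the pattern end.
        k = 0
        while k < m and text[s + k] == pattern[k]:
            k += 1
        if k == m:
            hits.append(s + 1)
        if s + m >= n:
            break  # no character past the window, so the search is over
        s += shift.get(text[s + m], m + 1)
    return hits


print(quick_search("GCGCAGAGAGTAGAGAGTACG", "CAGAGAG"))
```

Note that the shift is taken after every attempt, whether it ended in a match or a mismatch, so overlapping occurrences are still found.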

9 Example Text string T=GCGCAGAGAGTAGAGAGTACG Pattern string P=CAGAGAG qsBC: A=2, C=7, G=1, T=8. The pattern is aligned with T_1:
GCGCAGAGAGTAGAGAGTACG
CAGAGAG

10 Example T_1=G and P_1=C: mismatch.

11 Example The character just past the window is T_8=G. qsBC[G]=1, shift=1.

12 Example Text string T=GCGCAGAGAGTAGAGAGTACG Pattern string P=CAGAGAG qsBC: A=2, C=7, G=1, T=8. The pattern is aligned with T_2:
GCGCAGAGAGTAGAGAGTACG
 CAGAGAG

13 Example T_2=C matches P_1, but T_3=G and P_2=A: mismatch.

14 Example The character just past the window is T_9=A. qsBC[A]=2, shift=2.

15 Example Text string T=GCGCAGAGAGTAGAGAGTACG Pattern string P=CAGAGAG qsBC: A=2, C=7, G=1, T=8. The pattern is aligned with T_4:
GCGCAGAGAGTAGAGAGTACG
   CAGAGAG

16 Example All seven characters agree: exact match. We output position 4.

17 Example The character just past the window is T_11=T. qsBC[T]=8, shift=8.

18 Example Text string T=GCGCAGAGAGTAGAGAGTACG Pattern string P=CAGAGAG qsBC: A=2, C=7, G=1, T=8. The pattern is aligned with T_12:
GCGCAGAGAGTAGAGAGTACG
           CAGAGAG

19 Example T_12=A and P_1=C: mismatch.

20 Example The character just past the window is T_19=A. qsBC[A]=2, shift=2.

21 Example Text string T=GCGCAGAGAGTAGAGAGTACG Pattern string P=CAGAGAG qsBC: A=2, C=7, G=1, T=8. The pattern is aligned with T_14:
GCGCAGAGAGTAGAGAGTACG
             CAGAGAG

22 Example T_14=A and P_1=C: mismatch.

23 Example The character just past the window is T_21=G. qsBC[G]=1, shift=1.

24 Example Text string T=GCGCAGAGAGTAGAGAGTACG Pattern string P=CAGAGAG qsBC: A=2, C=7, G=1, T=8. The pattern is aligned with T_15:
GCGCAGAGAGTAGAGAGTACG
              CAGAGAG

25 Example T_15=G and P_1=C: mismatch. The window now ends at the last character of T, so the search stops.

26 Time complexity The preprocessing phase takes O(m+σ) time and O(σ) space, where σ is the size of the alphabet. The searching phase takes O(mn) time in the worst case.
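
To see where the O(mn) bound comes from, the sketch below counts character comparisons. The function name `count_comparisons` and the test strings are assumptions chosen for illustration. When both text and pattern consist of a single repeated character, every window matches in full and every shift is 1, so the count is m(n-m+1).

```python
def count_comparisons(text: str, pattern: str) -> int:
    """Count the character comparisons made by a Quick Search run."""
    n, m = len(text), len(pattern)
    shift = {ch: m - i for i, ch in enumerate(pattern)}  # rightmost occurrence wins
    comparisons = 0
    s = 0
    while s <= n - m:
        k = 0
        while k < m:
            comparisons += 1
            if text[s + k] != pattern[k]:
                break
            k += 1
        if s + m >= n:
            break
        s += shift.get(text[s + m], m + 1)
    return comparisons


# Worst case: 16 windows of 5 comparisons each.
print(count_comparisons("a" * 20, "a" * 5))
# Favourable case: one comparison per window and shifts of m + 1.
print(count_comparisons("b" * 20, "a" * 5))
```

The second call shows the other extreme: when the character past the window never occurs in the pattern, each attempt costs one comparison and skips m+1 positions.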

27 Reference
[KMP77] Fast pattern matching in strings, D. E. Knuth, J. H. Morris, Jr. and V. B. Pratt, SIAM J. Computing, 6, 1977, pp. 323-350.
[BM77] A fast string search algorithm, R. S. Boyer and J. S. Moore, Comm. ACM, 20, 1977, pp. 762-772.
[S90] A very fast substring search algorithm, D. M. Sunday, Comm. ACM, 33, 1990, pp. 132-142.
[RR89] The Rand MH Message Handling System: User's Manual (UCI Version), M. T. Rose and J. L. Romine, University of California, Irvine, 1989.
[S82] A comparison of three string matching algorithms, G. De V. Smith, Software - Practice & Experience, 12, 1982, pp. 57-66.
[HS91] Fast string searching, A. Hume and D. M. Sunday, Software - Practice & Experience, 21(11), 1991, pp. 1221-1248.
[S94] String Searching Algorithms, G. A. Stephen, World Scientific, 1994.
[ZT87] On improving the average case of the Boyer-Moore string matching algorithm, R. F. Zhu and T. Takaoka, Journal of Information Processing, 10(3), 1987, pp. 173-177.
[R92] Tuning the Boyer-Moore-Horspool string searching algorithm, T. Raita, Software - Practice & Experience, 22(10), 1992, pp. 879-884.
[S94] On tuning the Boyer-Moore-Horspool string searching algorithms, P. D. Smith, Software - Practice & Experience, 24(4), 1994, pp. 435-436.
[BR92] Average running time of the Boyer-Moore-Horspool algorithm, R. A. Baeza-Yates and M. Régnier, Theoretical Computer Science, 92(1), 1992, pp. 19-31.
[H80] Practical fast searching in strings, R. N. Horspool, Software - Practice & Experience, 10(6), 1980, pp. 501-506.
[L95] Experimental results on string matching algorithms, T. Lecroq, Software - Practice & Experience, 25(7), 1995, pp. 727-765.

28 Thank you for listening.

