Published by Hannah McNally. Modified over 4 years ago.

1
1 Approximate string matching using factor automata. J. Holub and B. Melichar, Theoretical Computer Science, vol. 249, pp. 305-311. Speaker: L. C. Chen. Advisor: R. C. T. Lee.

2
2 Problem: The edit distance D_L(P, X) between strings P and X is the minimum number of edit operations (substitution, insertion and deletion) needed to convert P into X. Given a text T of length n, a pattern P of length m, and an integer k with k ≤ m ≤ n, approximate string matching is the problem of determining whether T contains a string X whose edit distance D_L(P, X) from P is at most k.

3
3 An example of edit distance. To convert P = abcde into T = bcfeg: delete a to get P_1 = bcde; substitute d with f to get P_2 = bcfe; insert g to get bcfeg = T. Three edit operations are used.
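The edit distance in this example can be checked with a short dynamic-programming sketch. This is the textbook row-by-row computation, not something taken from the slides, and the function name edit_distance is only illustrative.

```python
def edit_distance(p, x):
    """Levenshtein distance between p and x, computed one DP row at a time."""
    prev = list(range(len(x) + 1))        # distance from "" to each prefix of x
    for i, pc in enumerate(p, start=1):
        cur = [i]                         # distance from p[:i] to ""
        for j, xc in enumerate(x, start=1):
            cur.append(min(
                prev[j] + 1,              # delete pc from p
                cur[j - 1] + 1,           # insert xc
                prev[j - 1] + (pc != xc), # match (cost 0) or substitute (cost 1)
            ))
        prev = cur
    return prev[-1]


print(edit_distance("abcde", "bcfeg"))  # 3
```

The three operations counted here correspond to the delete, substitute and insert steps listed on the slide.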

4
4 Basic definitions. Fac(T): the set of all substrings (factors) of text T. A nondeterministic finite automaton (NFA) is a five-tuple M = (Q, Σ, δ, q_0, F), where Q is a finite set of states, Σ is a finite input alphabet, δ is a mapping from Q × (Σ ∪ {ε}) into the set of subsets of Q, q_0 ∈ Q is the initial state, and F ⊆ Q is the set of final states. M(Fac(T)): a factor automaton, which accepts Fac(T).

5
5 Factor automaton. T = aabbabd, Fac(T) = {a, b, d, aa, ab, bb, ba, bd, aab, abb, bba, bab, abd, aabb, abba, bbab, babd, aabba, abbab, bbabd, aabbab, abbabd, aabbabd}. The factor automaton M(Fac(T)) is a deterministic finite automaton (DFA) that accepts all substrings of the given text T.
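As a rough illustration of what M(Fac(T)) accepts, the sketch below enumerates Fac(T) directly and also simulates a naive nondeterministic acceptor in which every text position is both an initial and a final state. This is an assumed simplification for clarity, not the DFA drawn on the slide; the names factors and is_factor are hypothetical.

```python
def factors(text):
    """All non-empty substrings of text."""
    n = len(text)
    return {text[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def is_factor(text, word):
    """Naive NFA for Fac(text): state i means 'standing before text[i]'.

    Every state is initial and final, so a run may start and stop anywhere.
    """
    states = set(range(len(text) + 1))
    for c in word:
        states = {i + 1 for i in states if i < len(text) and text[i] == c}
        if not states:
            return False
    return True


T = "aabbabd"
print(len(factors(T)))       # 23
print(is_factor(T, "bbab"))  # True
print(is_factor(T, "dab"))   # False
```

The count of 23 matches the number of strings listed in Fac(T) on the slide.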

6
6 A suffix tree can also be used to recognize all substrings of T = aabbabd, that is, Fac(T) = {a, b, d, aa, ab, bb, ba, bd, aab, abb, bba, bab, abd, aabb, abba, bbab, babd, aabba, abbab, bbabd, aabbab, abbabd, aabbabd}.

7
7 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Figure annotations: one matched, 0 errors; one matched, one error; three matched, 0 errors.

8
8 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Recognize ab.

9
9 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Recognize aab.

10
10 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}. Recognize bbab.
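The automaton figures are not preserved in this transcript, so the following is only a sketch of one NFA that reproduces the listed L_k(P). It assumes states (j, e) meaning "j pattern symbols matched, e errors", a deletion as an ε-move, and insertions allowed only strictly inside the pattern and only with a symbol different from the next pattern symbol. These assumptions were chosen because they yield exactly the 14 strings on the slide; the names lk_accepts and lk_language are hypothetical.

```python
from itertools import product


def lk_accepts(pattern, k, word):
    """Simulate an NFA over states (j, e): j symbols matched, e errors used."""
    m = len(pattern)

    def closure(states):
        # Deletion is an epsilon move (j, e) -> (j + 1, e + 1).
        out = set(states)
        for j, e in states:
            while j < m and e < k:
                j, e = j + 1, e + 1
                out.add((j, e))
        return out

    current = closure({(0, 0)})
    for c in word:
        nxt = set()
        for j, e in current:
            if j == m:
                continue                  # pattern exhausted, no moves left
            if c == pattern[j]:
                nxt.add((j + 1, e))       # match
            elif e < k:
                nxt.add((j + 1, e + 1))   # substitution
                if j > 0:
                    nxt.add((j, e + 1))   # insertion inside the pattern
        current = closure(nxt)
    return any(j == m for j, _ in current)


def lk_language(pattern, k, alphabet):
    """Enumerate every accepted word; accepted words have length <= m + k."""
    words = set()
    for length in range(len(pattern) + k + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            if lk_accepts(pattern, k, w):
                words.add(w)
    return words


print(sorted(lk_language("bab", 1, "abd")))
print([lk_accepts("bab", 1, w) for w in ("ab", "aab", "bbab")])
```

Under these assumptions the three words recognized on slides 8 to 10 (ab, aab, bbab) are all accepted, each ending in a state with three symbols matched.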

11
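11 Definition: Let M_1 and M_2 be finite automata. An automaton for the intersection of M_1 and M_2 is an automaton M that accepts exactly the strings accepted by both, L(M) = L(M_1) ∩ L(M_2).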

12
12 T = aabbabd, P = bab, k = 1. Intersection of M(L_k(P)) and M(Fac(T)). Solutions: {ba, bab, bb, bbab, aab, ab}. (All end in state {3,0} or {3,1}.)

13
13 T = aabbabd, P = bab, k = 1. Intersection of M(L_k(P)) and M(Fac(T)).
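The intersection for this example can be sketched by running two automata in lockstep and keeping a word only when both accept it. The code below is an illustrative product-style simulation on sets of states, not the construction from the paper. Both automata are assumed models: a naive acceptor for Fac(T) in which every position is initial and final, and an NFA for L_k(P) with states (j, e), deletion as an ε-move, and insertion only inside the pattern. All function names are hypothetical.

```python
def make_factor_nfa(text):
    """Naive acceptor for Fac(text); a factor may begin at any position."""
    n = len(text)
    start = frozenset(range(n + 1))

    def step(states, c):
        return frozenset(i + 1 for i in states if i < n and text[i] == c)

    def accepting(states):
        return bool(states)               # every state is final

    return start, step, accepting


def make_lk_nfa(pattern, k):
    """Assumed NFA for L_k(pattern): (j, e) = j symbols matched, e errors."""
    m = len(pattern)

    def closure(states):
        out = set(states)
        for j, e in states:
            while j < m and e < k:        # deletion: epsilon move
                j, e = j + 1, e + 1
                out.add((j, e))
        return frozenset(out)

    def step(states, c):
        nxt = set()
        for j, e in states:
            if j == m:
                continue
            if c == pattern[j]:
                nxt.add((j + 1, e))       # match
            elif e < k:
                nxt.add((j + 1, e + 1))   # substitution
                if j > 0:
                    nxt.add((j, e + 1))   # insertion inside the pattern
        return closure(nxt)

    def accepting(states):
        return any(j == m for j, _ in states)

    return closure({(0, 0)}), step, accepting


def intersection_language(a, b, alphabet):
    """Walk both automata together and collect words accepted by both."""
    (s1, step1, acc1), (s2, step2, acc2) = a, b
    found, frontier = set(), [("", s1, s2)]
    while frontier:
        word, x, y = frontier.pop()
        if acc1(x) and acc2(y):
            found.add(word)
        for c in alphabet:
            nx, ny = step1(x, c), step2(y, c)
            if nx and ny:                 # prune pairs where either side is dead
                frontier.append((word + c, nx, ny))
    return found


result = intersection_language(make_factor_nfa("aabbabd"), make_lk_nfa("bab", 1), "abd")
print(sorted(result))  # ['aab', 'ab', 'ba', 'bab', 'bb', 'bbab']
```

The search terminates because every step consumes one symbol of T, so no run can be longer than the text. The result agrees with the six solutions listed on slide 12.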

14
14 Intersection: T = aabbabd, P = bab, D_L(P, ba) = 1.

15
15 Intersection: T = aabbabd, P = bab, D_L(P, bab) = 0.

16
16 Intersection: T = aabbabd, P = bab, D_L(P, bb) = 1.

17
17 Intersection: T = aabbabd, P = bab, D_L(P, bbab) = 1.

18
18 Intersection: T = aabbabd, P = bab, D_L(P, aab) = 1.

19
19 Intersection: T = aabbabd, P = bab, D_L(P, ab) = 1.

20
20 Lemma The number of automaton is always lower than.

21
21 T = aabbabd, P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P). L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}.

22
22 Thank you!
