Published by Hannah McNally. Modified over 3 years ago.

1 Approximate string matching using factor automata
J. Holub and B. Melichar, Theoretical Computer Science, vol. 249, pp. 305-311
Speaker: L. C. Chen
Advisor: R. C. T. Lee

2 Problem
The edit distance D_L(P, X) between strings P and X is the minimum number of edit operations (substitution, insertion, and deletion) needed to convert P into X.
Given a text T, a pattern P, and an integer k with k ≤ m ≤ n, where m = |P| and n = |T|, approximate string matching asks whether T contains a substring X such that D_L(P, X) ≤ k.
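As a baseline for the problem statement, the decision question can be answered with the classical dynamic-programming scan usually attributed to Sellers. This is a sketch of that standard technique, not the factor-automaton method of the talk; the function name `approx_occurs` is an illustrative choice.

```python
def approx_occurs(pattern: str, text: str, k: int) -> bool:
    """Return True if some substring X of text satisfies D_L(pattern, X) <= k."""
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text that ends at the current text position.
    col = list(range(m + 1))
    if col[m] <= k:              # the empty substring qualifies when m <= k
        return True
    for c in text:
        prev_diag = col[0]       # value of col[i - 1] before this text symbol
        col[0] = 0               # a match may start at any text position
        for i in range(1, m + 1):
            old = col[i]
            col[i] = min(
                prev_diag + (pattern[i - 1] != c),  # match or substitution
                old + 1,                            # c is an extra text symbol
                col[i - 1] + 1,                     # pattern[i - 1] is skipped
            )
            prev_diag = old
        if col[m] <= k:
            return True
    return False


print(approx_occurs("bab", "aabbabd", 1))   # True
print(approx_occurs("ddd", "aabbabd", 1))   # False
```

The scan costs O(mn) time and O(m) space, which is the usual point of comparison for automaton-based approaches.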

3 An example of edit distance
To convert P = abcde into T = bcfeg:
Delete a: P_1 = bcde
Substitute d with f: P_2 = bcfe
Insert g: T = bcfeg
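The example can be checked with the textbook dynamic program for edit distance. The sketch below is a generic implementation (the name `edit_distance` is illustrative) and confirms that three operations are the minimum for this pair.

```python
def edit_distance(p: str, x: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning p into x."""
    prev = list(range(len(x) + 1))           # distances from "" to x[:j]
    for i, pc in enumerate(p, start=1):
        cur = [i]                            # distance from p[:i] to ""
        for j, xc in enumerate(x, start=1):
            cur.append(min(
                prev[j - 1] + (pc != xc),    # keep pc or substitute it
                prev[j] + 1,                 # delete pc
                cur[j - 1] + 1,              # insert xc
            ))
        prev = cur
    return prev[-1]


print(edit_distance("abcde", "bcfeg"))   # 3
```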

4 Basic definitions
Fac(T): the set of all substrings (factors) of the text T.
A nondeterministic finite automaton (NFA) is a five-tuple M = (Q, Σ, δ, q_0, F), where Q is a finite set of states, Σ is a finite input alphabet, δ is a mapping from Q × (Σ ∪ {ε}) into the set of subsets of Q, q_0 ∈ Q is the initial state, and F ⊆ Q is the set of final states.
M(Fac(T)): the factor automaton, which accepts Fac(T).
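The five-tuple maps directly onto a small data structure. The sketch below is one possible encoding, with the class name `NFA`, the ε label `EPS`, and the helper `factor_nfa` all chosen for illustration. It builds a simple nondeterministic automaton for Fac(T), assuming the common construction with ε-jumps from the initial state to every text position. Because every state is final, it also accepts the empty string.

```python
from collections import defaultdict

EPS = ""  # label used here for ε-transitions (an arbitrary choice)


class NFA:
    def __init__(self, start, finals):
        self.start = start
        self.finals = set(finals)
        self.delta = defaultdict(set)    # (state, symbol) -> set of states

    def add(self, src, symbol, dst):
        self.delta[(src, symbol)].add(dst)

    def closure(self, states):
        """All states reachable from `states` through ε-transitions only."""
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in self.delta.get((q, EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    def accepts(self, word):
        current = self.closure({self.start})
        for symbol in word:
            moved = set()
            for q in current:
                moved |= self.delta.get((q, symbol), set())
            current = self.closure(moved)
        return bool(current & self.finals)


def factor_nfa(text):
    """NFA for Fac(text): states 0..n, all final, ε-jumps from state 0."""
    nfa = NFA(start=0, finals=range(len(text) + 1))
    for i, c in enumerate(text):
        nfa.add(i, c, i + 1)      # read text[i]
        nfa.add(0, EPS, i + 1)    # a factor may start at any later position
    return nfa


m = factor_nfa("aabbabd")
print(m.accepts("bbab"), m.accepts("bad"))   # True False
```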

5 Factor automaton
T = aabbabd
Fac(T) = {a, b, d, aa, ab, bb, ba, bd, aab, abb, bba, bab, abd, aabb, abba, bbab, babd, aabba, abbab, bbabd, aabbab, abbabd, aabbabd}
The factor automaton M(Fac(T)) is a deterministic finite automaton (DFA) that accepts all substrings of the given text T.
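One way to obtain a deterministic factor automaton is the subset construction over text positions. The sketch below assumes that construction (the slide's figure is not available in the text, so the state set may differ from the one drawn) and checks it against a direct enumeration of Fac(T). The function names are illustrative.

```python
def factor_dfa(text):
    """Deterministic factor automaton; each state is a set of text positions."""
    n = len(text)
    start = frozenset(range(n + 1))
    delta, states, todo = {}, {start}, [start]
    while todo:
        state = todo.pop()
        for c in set(text):
            target = frozenset(i + 1 for i in state if i < n and text[i] == c)
            if target:                        # no transition into the empty set
                delta[(state, c)] = target
                if target not in states:
                    states.add(target)
                    todo.append(target)
    return start, delta, states


def accepts(dfa, word):
    start, delta, _ = dfa
    state = start
    for c in word:
        state = delta.get((state, c))
        if state is None:
            return False
    return True                               # every reachable state is final


def factors(text):
    """Fac(text) without the empty string, by direct enumeration."""
    n = len(text)
    return {text[i:j] for i in range(n) for j in range(i + 1, n + 1)}


dfa = factor_dfa("aabbabd")
fac = factors("aabbabd")
print(len(fac))                               # 23
print(all(accepts(dfa, w) for w in fac))      # True
```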

6 A suffix tree can also be used to recognize all substrings of T = aabbabd:
Fac(T) = {a, b, d, aa, ab, bb, ba, bd, aab, abb, bba, bab, abd, aabb, abba, bbab, babd, aabba, abbab, bbabd, aabbab, abbabd, aabbabd}
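The idea can be illustrated with an uncompressed suffix trie, which is simpler than a compact suffix tree but recognizes the same set of substrings. This is a quadratic-size sketch for illustration only; the function names are hypothetical.

```python
def build_suffix_trie(text):
    """Uncompressed suffix trie as nested dicts: one path per suffix of text."""
    root = {}
    for start in range(len(text)):
        node = root
        for c in text[start:]:
            node = node.setdefault(c, {})
    return root


def is_factor(trie, word):
    """A word is a substring of the text iff it labels a path from the root."""
    node = trie
    for c in word:
        if c not in node:
            return False
        node = node[c]
    return True


trie = build_suffix_trie("aabbabd")
print(is_factor(trie, "abba"), is_factor(trie, "bad"))   # True False
```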

7 P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P).
L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}.
State labels in the figure: one matched, 0 errors; one matched, one error; three matched, 0 errors.
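The figure is not reproduced in the text, so the transition structure below is an assumption: states are pairs (symbols matched, errors), with a match transition, a substitution on any other symbol, an ε-transition for deletion, and an insertion on any other symbol strictly inside the pattern. With the alphabet {a, b, d}, this sketch accepts exactly the 14 strings listed on the slide.

```python
from itertools import product


def lk_nfa_accepts(pattern, k, word):
    """Simulate the (matched, errors) NFA for `pattern` on `word`."""
    m = len(pattern)

    def closure(states):
        # ε-transitions model deleting pattern[i]: (i, e) -> (i + 1, e + 1)
        stack, seen = list(states), set(states)
        while stack:
            i, e = stack.pop()
            if i < m and e < k and (i + 1, e + 1) not in seen:
                seen.add((i + 1, e + 1))
                stack.append((i + 1, e + 1))
        return seen

    current = closure({(0, 0)})
    for c in word:
        moved = set()
        for i, e in current:
            if i == m:
                continue                      # nothing follows the last pattern symbol
            if c == pattern[i]:
                moved.add((i + 1, e))         # match
            elif e < k:
                moved.add((i + 1, e + 1))     # substitution
                if i > 0:
                    moved.add((i, e + 1))     # insertion inside the pattern
        current = closure(moved)
    return any(i == m for i, _ in current)


def language(pattern, k, alphabet):
    """All non-empty words over `alphabet` accepted by the NFA."""
    words = set()
    for length in range(1, len(pattern) + k + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            if lk_nfa_accepts(pattern, k, w):
                words.add(w)
    return words


print(sorted(language("bab", 1, "abd"), key=lambda w: (len(w), w)))
```

The final states are the pairs (3, 0) and (3, 1), that is, all three pattern symbols consumed with at most one error.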

8 P = bab, k = 1. M(L_k(P)) recognizes ab.

9 P = bab, k = 1. M(L_k(P)) recognizes aab.

10 P = bab, k = 1. M(L_k(P)) recognizes bbab.

11 Definition
Let M_1 and M_2 be finite automata. An automaton for the intersection of M_1 and M_2 is an automaton that accepts exactly the strings accepted by both M_1 and M_2.
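The formulas of this definition did not survive in the text, so the sketch below shows the standard product construction for two deterministic automata rather than the slide's exact notation. Automata are encoded as (start, finals, delta) triples, an encoding chosen here for illustration, and the two example automata are hypothetical.

```python
def intersect(dfa1, dfa2):
    """Product construction; delta maps (state, symbol) -> state and may be partial."""
    start1, finals1, delta1 = dfa1
    start2, finals2, delta2 = dfa2
    start = (start1, start2)
    delta, finals = {}, set()
    todo, seen = [start], {start}
    while todo:
        q1, q2 = todo.pop()
        if q1 in finals1 and q2 in finals2:
            finals.add((q1, q2))              # final only if both components are final
        for (src, symbol), dst1 in delta1.items():
            if src != q1 or (q2, symbol) not in delta2:
                continue
            target = (dst1, delta2[(q2, symbol)])
            delta[((q1, q2), symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, finals, delta


def accepts(dfa, word):
    state, finals, delta = dfa
    for symbol in word:
        if (state, symbol) not in delta:
            return False
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical examples: "even number of a's" and "ends with b" over {a, b}.
even_a = (0, {0}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1})
ends_b = ("n", {"y"}, {("n", "a"): "n", ("n", "b"): "y",
                       ("y", "a"): "n", ("y", "b"): "y"})
both = intersect(even_a, ends_b)
print(accepts(both, "aab"), accepts(both, "ab"))   # True False
```

Only pairs reachable from the initial pair are built, so the product can be much smaller than the full set of state pairs.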

12 T = aabbabd, P = bab, k = 1
Intersection of M(L_k(P)) and M(Fac(T)).
Solutions: {ba, bab, bb, bbab, aab, ab}. (All end in a state labeled {3,0} or {3,1}.)
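The solution set can be reproduced by exploring the intersection on the fly: each configuration pairs a span of T with a state (matched, errors) of the pattern automaton. The transition rules are an assumption (match, substitution, deletion as an ε-move, insertion only strictly inside the pattern), since the figure is not available in the text; `solutions` is an illustrative name.

```python
def solutions(text, pattern, k):
    """Factors of text accepted by the (matched, errors) automaton for pattern."""
    n, m = len(text), len(pattern)
    found = set()
    # Configuration: text span [start, pos) read so far, automaton state (i, e).
    todo = [(s, s, 0, 0) for s in range(n + 1)]
    seen = set(todo)

    def push(cfg):
        if cfg not in seen:
            seen.add(cfg)
            todo.append(cfg)

    while todo:
        start, pos, i, e = todo.pop()
        if i == m:
            if pos > start:
                found.add(text[start:pos])            # final state (m, e) reached
            continue
        if e < k:
            push((start, pos, i + 1, e + 1))          # deletion (ε-transition)
        if pos < n:
            if text[pos] == pattern[i]:
                push((start, pos + 1, i + 1, e))      # match
            elif e < k:
                push((start, pos + 1, i + 1, e + 1))  # substitution
                if i > 0:
                    push((start, pos + 1, i, e + 1))  # insertion inside the pattern
    return found


print(sorted(solutions("aabbabd", "bab", 1)))
# ['aab', 'ab', 'ba', 'bab', 'bb', 'bbab']
```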

13 T = aabbabd, P = bab, k = 1
Intersection of M(L_k(P)) and M(Fac(T)).

14 Intersection: T = aabbabd, P = bab, D_L(P, ba) = 1

15 Intersection: T = aabbabd, P = bab, D_L(P, bab) = 0

16 Intersection: T = aabbabd, P = bab, D_L(P, bb) = 1

17 Intersection: T = aabbabd, P = bab, D_L(P, bbab) = 1

18 Intersection: T = aabbabd, P = bab, D_L(P, aab) = 1

19 Intersection: T = aabbabd, P = bab, D_L(P, ab) = 1

20 Lemma
The number of states of the automaton is always lower than …

21 T = aabbabd, P = bab, k = 1. The finite automaton M(L_k(P)) accepts L_k(P).
L_k(P) = {ab, bb, ba, aab, bab, dab, bbb, bdb, baa, bad, bbab, bdab, baab, badb}.

22 Thank you!
