
Published by Kayla Willis. Modified over 3 years ago.

1
An Extension of the String-to-String Correction Problem
Roy Lowrance and Robert A. Wagner
Journal of the ACM, vol. 22, no. 2, April 1975, pp. 177-183.
Speaker:

2
Edit Distance
Three edit operations:
– Substitution: abcd -> aacd (change b to a)
– Insertion: abcd -> abacd (insert an a)
– Deletion: abcd -> abd (delete c)
Given two strings T and P, the problem is to determine the minimum number of edit operations needed to transform T into P.
Note: For clarity, we assume that all edit operations have the same cost.

3
Example: sunday (rows) and saturday (columns)

          s  a  t  u  r  d  a  y
       0  1  2  3  4  5  6  7  8
    s  1  0  1  2  3  4  5  6  7
    u  2  1  1  2  2  3  4  5  6
    n  3  2  2  2  3  3  4  5  6
    d  4  3  3  3  3  4  3  4  5
    a  5  4  3  4  4  4  4  3  4
    y  6  5  4  4  5  5  5  4  3

d[i, j] = min( d[i-1, j] + 1,
               d[i, j-1] + 1,
               d[i-1, j-1] + cost(A[i]->B[j]) )

This example is copied from Wikipedia.
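The recurrence on this slide can be sketched in Python as follows. This is a minimal illustration, assuming unit costs for all three operations and cost(A[i]->B[j]) = 0 when the characters match; the function name `edit_distance` is a placeholder chosen here, not something from the slides.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance between a and b (assumed cost model)."""
    m, n = len(a), len(b)
    # d[i][j] = distance between the first i chars of a and the first j chars of b
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[m][n]


print(edit_distance("sunday", "saturday"))  # bottom-right cell of the table: 3
```

The printed value corresponds to the bottom-right entry of the matrix shown on the slide.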

4
The Problem
This paper extends the set of edit operations to include the operation of interchanging two adjacent characters.
– Swap
Example:
T: a b c d
P: c d a
a b c d -> a c d -> c a d -> c d a
(delete b, swap a and c, swap a and d)
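The example sequence can be replayed step by step. The helpers `delete` and `swap` below are hypothetical names introduced only for this illustration; they use 0-based positions.

```python
def delete(s: str, i: int) -> str:
    """Remove the character at 0-based position i."""
    return s[:i] + s[i + 1:]


def swap(s: str, i: int) -> str:
    """Interchange the adjacent characters at 0-based positions i and i+1."""
    return s[:i] + s[i + 1] + s[i] + s[i + 2:]


steps = ["abcd"]
steps.append(delete(steps[-1], 1))  # delete b
steps.append(swap(steps[-1], 0))    # swap a and c
steps.append(swap(steps[-1], 1))    # swap a and d
print(" -> ".join(steps))           # abcd -> acd -> cad -> cda
```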

5
Trace
A trace is a graphical specification of how edit operations apply to each character in the two strings.
Example:
T: a b c d
P: c d a

6
Important Properties
In the following cases, the edit operations can be replaced by other edit operations.
[Figure: trace diagrams, including abc -> bca; the line drawings are not recoverable.]

7
[Figure: trace diagrams for abc -> bca and related cases; the line drawings are not recoverable. Surviving labels: 2 swaps; insertion + deletion; deletion + substitution; 2 substitutions; swap + substitution; swap + K deletions + L insertions; a trace with lower cost.]

8
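The Algorithm
[Figure: A contains a at position i' and b at position i; B contains b at position j' and a at position j, with i' < i and j' < j.]

d[i, j] = min( d[i-1, j] + 1,
               d[i, j-1] + 1,
               d[i-1, j-1] + cost(A[i]->B[j]),
               d[i'-1, j'-1] + (i-i'-1) + (j-j'-1) + 1 )

The last term covers a swap of A[i'] and A[i]: delete the i-i'-1 characters between them, swap, then insert the j-j'-1 characters that lie between B[j'] and B[j].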
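The four-term recurrence can be sketched in Python as below. This is an illustration under assumptions rather than the paper's exact procedure: all operations cost 1, i' is taken as the last position before i in A holding B[j], and j' as the last position before j in B holding A[i]. The names `swap_edit_distance`, `last_row`, and `last_col` are placeholders introduced here.

```python
def swap_edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance with adjacent swaps (assumed cost model)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    last_row = {}  # last_row[c]: largest 1-based i' < i with a[i'] == c
    for i in range(1, m + 1):
        last_col = 0  # largest 1-based j' < j with b[j'] == a[i]
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            ip = last_row.get(b[j - 1], 0)
            jp = last_col
            if ip > 0 and jp > 0:
                # swap, plus deletions and insertions between the swapped pair
                best = min(best, d[ip - 1][jp - 1] + (i - ip - 1) + (j - jp - 1) + 1)
            d[i][j] = best
            if a[i - 1] == b[j - 1]:
                last_col = j
        last_row[a[i - 1]] = i
    return d[m][n]


print(swap_edit_distance("ab", "ba"))     # 1: a single swap
print(swap_edit_distance("abcd", "cda"))  # 3: matches the earlier example
```

Tracking `last_row` and `last_col` while filling the table is one way to find i' and j' without extra scans, which keeps the running time proportional to the table size.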

9
Summary
With simple preprocessing of T and P, the problem can be solved by dynamic programming in O(|T| |P|) time.
If the edit operations are allowed to have different costs:
– Insertion (cost W_I)
– Deletion (cost W_D)
– Swap (cost W_S)
– Substitution (cost W_C)
then the algorithm works if 2 W_S >= W_I + W_D.
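A weighted variant might look like the sketch below. It assumes the condition reads 2 W_S >= W_I + W_D and rejects weights that violate it; raising an error is a design choice made for this illustration, not something the slides prescribe. The function and parameter names (`weighted_swap_distance`, `w_ins`, `w_del`, `w_swap`, `w_sub`) are hypothetical.

```python
def weighted_swap_distance(t: str, p: str, w_ins: float = 1.0, w_del: float = 1.0,
                           w_swap: float = 1.0, w_sub: float = 1.0) -> float:
    """Weighted cost of transforming t into p, with adjacent swaps allowed."""
    if 2 * w_swap < w_ins + w_del:
        # assumed precondition from the summary slide
        raise ValueError("requires 2 * w_swap >= w_ins + w_del")
    m, n = len(t), len(p)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * w_del
    for j in range(1, n + 1):
        d[0][j] = j * w_ins
    last_row = {}  # last 1-based position in t of each character seen so far
    for i in range(1, m + 1):
        last_col = 0  # last 1-based position j' < j with p[j'] == t[i]
        for j in range(1, n + 1):
            sub = 0.0 if t[i - 1] == p[j - 1] else w_sub
            best = min(d[i - 1][j] + w_del,
                       d[i][j - 1] + w_ins,
                       d[i - 1][j - 1] + sub)
            ip, jp = last_row.get(p[j - 1], 0), last_col
            if ip and jp:
                best = min(best, d[ip - 1][jp - 1]
                           + (i - ip - 1) * w_del + w_swap + (j - jp - 1) * w_ins)
            d[i][j] = best
            if t[i - 1] == p[j - 1]:
                last_col = j
        last_row[t[i - 1]] = i
    return d[m][n]


print(weighted_swap_distance("ab", "ba"))              # 1.0: one swap
print(weighted_swap_distance("ab", "ba", w_swap=3.0))  # 2.0: swap is no longer cheapest
```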
