Complement to lecture 11 : Levenshtein distance algorithm


1 Complement to lecture 11 : Levenshtein distance algorithm

2 Levenshtein distance
Also called “edit distance”
Purpose: to compute the smallest set of basic operations
   Insertion
   Deletion
   Replacement
that will turn one string into another
Intro. to Programming, lecture 11 (complement): Levenshtein

3 Levenshtein distance
“Michael Jackson” to “Mendelssohn”

Source     M  I  C  H  A  E  L  _  J  A  C  K  S  O  -  N
Target     M  E  -  N  D  E  L  S  -  -  -  -  S  O  H  N
Operation     S  D  S  S        S  D  D  D  D        I
Distance      1  2  3  4        5  6  7  8  9        10

S = substitute, D = delete, I = insert; “_” stands for the space character and “-” for a gap.
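The operation sequence on this slide (S D S S S D D D D I) can be replayed mechanically. The following is a minimal Python sketch, not part of the lecture: the helper `apply_script`, the tuple encoding of steps, and the `SCRIPT` list are hypothetical names introduced for illustration, and the placement of the "keep" steps between the listed operations is an assumption inferred from the two strings.

```python
def apply_script(source, script):
    """Replay an edit script over source, left to right.

    Each step is ("K",) keep, ("D",) delete, ("S", ch) substitute
    or ("I", ch) insert. Returns the resulting string and the number
    of non-keep steps (the cost of the script).
    """
    out = []
    pos = 0   # index of the next unread character of source
    cost = 0
    for step in script:
        op = step[0]
        if op == "K":            # copy the current character unchanged
            out.append(source[pos])
            pos += 1
        elif op == "D":          # skip the current character
            pos += 1
            cost += 1
        elif op == "S":          # replace the current character
            out.append(step[1])
            pos += 1
            cost += 1
        elif op == "I":          # emit a new character, consume nothing
            out.append(step[1])
            cost += 1
        else:
            raise ValueError(f"unknown operation {op!r}")
    out.extend(source[pos:])     # any unread tail is kept as-is
    return "".join(out), cost


# Assumed alignment: keeps are inferred, the other steps follow the slide.
SCRIPT = [
    ("K",), ("S", "E"), ("D",), ("S", "N"), ("S", "D"), ("K",), ("K",),
    ("S", "S"), ("D",), ("D",), ("D",), ("D",),
    ("K",), ("K",), ("I", "H"), ("K",),
]

result, cost = apply_script("MICHAEL JACKSON", SCRIPT)
print(result, cost)
```

The script uses ten non-keep steps, which matches the distance counter running from 1 to 10 on the slide.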

4 Levenshtein distance algorithm
levenshtein (source, target : STRING): INTEGER
        -- Minimum number of operations to turn source into target
    local
        distance : ARRAY_2 [INTEGER]    -- Indexed from zero
        i, j, deletion, insertion, substitution : INTEGER
    do
        create distance.make (source.count, target.count)
        from i := 0 until i > source.count loop
            distance [i, 0] := i ; i := i + 1
        end
        from j := 0 until j > target.count loop
            distance [0, j] := j ; j := j + 1
        end
        -- (Continued)
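The initialization step can be sketched in Python as follows. This is an illustrative transcription, not the lecture's code: `init_table` is a hypothetical name, and a list of lists stands in for `ARRAY_2`. Because indices run from 0 up to and including the string length, the sketch allocates one more row and column than there are characters.

```python
def init_table(source, target):
    """Build the distance table with only its borders filled in."""
    rows = len(source) + 1   # indices 0 .. len(source)
    cols = len(target) + 1   # indices 0 .. len(target)
    distance = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        # Turning the first i characters of source into the empty
        # string takes i deletions.
        distance[i][0] = i
    for j in range(cols):
        # Turning the empty string into the first j characters of
        # target takes j insertions.
        distance[0][j] = j
    return distance


table = init_table("BEETH", "BEATLES")
print(table[0])                   # top row
print([row[0] for row in table])  # left column
```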

5 Levenshtein, continued
        from i := 1 until i > source.count loop
            from
                j := 1
            invariant
                -- For all p : 0 .. i, q : 0 .. j – 1, we can turn source [1 .. p]
                -- into target [1 .. q] in distance [p, q] operations
            until
                j > target.count
            loop
                if source [i] = target [j] then
                    distance [i, j] := distance [i - 1, j - 1]
                else
                    deletion := distance [i - 1, j]
                    insertion := distance [i, j - 1]
                    substitution := distance [i - 1, j - 1]
                    distance [i, j] := minimum (deletion, insertion, substitution) + 1
                end
                j := j + 1
            end
            i := i + 1
        end
        Result := distance [source.count, target.count]
    end

s [m .. n]: substring of s with items at positions k such that m ≤ k ≤ n (empty if m > n)
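For readers who want to run the algorithm, here is a minimal Python transcription of the two slides. It is a sketch under stated assumptions rather than the lecture's own code: Python strings are indexed from zero, so `source[i - 1]` plays the role of `source [i]`, and the built-in `min` replaces the `minimum` helper.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and substitutions
    needed to turn source into target."""
    n, m = len(source), len(target)
    # distance[i][j]: cost of turning source[:i] into target[:j]
    distance = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        distance[i][0] = i
    for j in range(m + 1):
        distance[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if source[i - 1] == target[j - 1]:
                # Same character: keep it at no extra cost.
                distance[i][j] = distance[i - 1][j - 1]
            else:
                deletion = distance[i - 1][j]
                insertion = distance[i][j - 1]
                substitution = distance[i - 1][j - 1]
                distance[i][j] = min(deletion, insertion, substitution) + 1
    return distance[n][m]


print(levenshtein("BEETH", "BEATLES"))
```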

6 Target B E A T L E S Source  1 2 3 4 5 6 7 I I I I I I I 1 2 3 4 5 6 7 D K I I I I I I B 1 1 1 2 3 4 5 6 D D K K I I I I E 2 2 1 1 2 3 4 5 D D K S S S K I I I E 3 3 2 1 1 2 3 3 ? 4 D D D S D K S I I I T 4 4 3 2 2 1 2 3 4 D D D S D D S S S I I H 5 5 4 3 3 2 2 3 4 I K D S Keep Insert Delete Substitute

7 BEETH to BEATLES: operations along a minimum path

        B   E   A   T   L   E   S
    0   1   2   3   4   5   6   7
 B  1  [0]  1   2   3   4   5   6
 E  2   1  [0]  1   2   3   4   5
 E  3   2   1  [1]  2   3   3   4
 T  4   3   2   2  [1] [2] [3]  4
 H  5   4   3   3   2   2   3  [4]

Bracketed cells mark the path. Operations, with the target position they produce:
   Keep B, 1
   Keep E, 2
   Subst E→A, 3
   Keep T, 4
   Ins L, 5
   Ins E, 6
   Subst H→S, 7

Keep, Insert, Delete, Substitute
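The list of operations can be recovered from a filled table by walking back from the bottom-right cell. The following Python sketch shows one way to do this; it is an assumption-laden illustration, not the lecture's method: `edit_script` is a hypothetical name, and the tie-breaking order (keep, then substitute, then insert, then delete) is a choice made here that happens to reproduce the path on the slide.

```python
def edit_script(source, target):
    """Return one minimum-cost list of operations turning source into target."""
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if source[i - 1] == target[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1]) + 1

    ops = []
    i, j = n, m
    while i > 0 or j > 0:   # walk back towards cell (0, 0)
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            ops.append(f"Keep {source[i - 1]}")
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            ops.append(f"Subst {source[i - 1]}->{target[j - 1]}")
            i, j = i - 1, j - 1
        elif j > 0 and d[i][j] == d[i][j - 1] + 1:
            ops.append(f"Ins {target[j - 1]}")
            j -= 1
        else:
            ops.append(f"Del {source[i - 1]}")
            i -= 1
    ops.reverse()           # operations were collected back to front
    return ops


for op in edit_script("BEETH", "BEATLES"):
    print(op)
```

Other minimum paths may exist for a given pair of strings; a different tie-breaking order can return a different script of the same cost.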

