
1 Einführung in die Programmierung / Introduction to Programming
Prof. Dr. Bertrand Meyer, Chair of Software Engineering
Complement to lecture 11: Levenshtein distance algorithm

2 Levenshtein distance
Also called "edit distance".
Purpose: to compute the smallest set of basic operations
- Insertion
- Deletion
- Replacement
that will turn one string into another.
Intro. to Programming, lecture 11 (complement): Levenshtein

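3 Levenshtein distance
"Michael Jackson" to "Mendelssohn" (_ marks the space in the source)

Source      M  I  C  H  A  E  L  _  J  A  C  K  S  O     N
Target      M  E     N  D  E  L  S              S  O  H  N
Operation      S  D  S  S        S  D  D  D  D        I
Distance    0  1  2  3  4        5  6  7  8  9        10

S: substitute, D: delete, I: insert; unmarked columns are kept unchanged.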
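To check the example by hand, the operation string S D S S S D D D D I can be replayed against the source. The following is a minimal sketch in Python (not the Eiffel used on the slides). The helper `apply_script` and the exact pairing of operations with characters (for instance, that the space is the character substituted by S) are assumptions made for illustration, not something the slide states.

```python
def apply_script(source, script):
    """Replay a left-to-right edit script on source.

    Each step is ("K",) keep, ("S", c) substitute by c,
    ("D",) delete, or ("I", c) insert c.
    Returns the resulting string and the number of non-keep steps.
    """
    out = []
    pos = 0    # next unread position in source
    cost = 0   # number of insertions, deletions and substitutions so far
    for step in script:
        op = step[0]
        if op == "K":
            out.append(source[pos])
            pos += 1
        elif op == "S":
            out.append(step[1])
            pos += 1
            cost += 1
        elif op == "D":
            pos += 1
            cost += 1
        elif op == "I":
            out.append(step[1])
            cost += 1
        else:
            raise ValueError("unknown operation: " + op)
    if pos != len(source):
        raise ValueError("script does not consume the whole source")
    return "".join(out), cost


# Assumed reading of the slide's example (hypothetical pairing of
# operations with characters).
SCRIPT = [
    ("K",),                           # M
    ("S", "E"),                       # I -> E
    ("D",),                           # C
    ("S", "N"),                       # H -> N
    ("S", "D"),                       # A -> D
    ("K",),                           # E
    ("K",),                           # L
    ("S", "S"),                       # space -> S
    ("D",), ("D",), ("D",), ("D",),   # J, A, C, K
    ("K",),                           # S
    ("K",),                           # O
    ("I", "H"),                       # insert H
    ("K",),                           # N
]

result, cost = apply_script("MICHAEL JACKSON", SCRIPT)
print(result, cost)  # MENDELSSOHN 10
```

The script uses 4 substitutions, 5 deletions and 1 insertion, which matches the distance of 10 shown on the slide.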

4 Levenshtein distance algorithm

levenshtein (source, target: STRING): INTEGER
        -- Minimum number of operations to turn source into target
    local
        distance: ARRAY_2 [INTEGER]
        i, j, deletion, insertion, substitution: INTEGER
    do
        create distance.make (source.count, target.count)    -- Indexed from zero
        from i := 0 until i > source.count loop
            distance [i, 0] := i ; i := i + 1
        end
        from j := 0 until j > target.count loop
            distance [0, j] := j ; j := j + 1
        end
        -- (Continued)

5 Levenshtein, continued

        from i := 1 until i > source.count loop
            from
                j := 1
            invariant
                -- For all p: 0 .. i, q: 0 .. j - 1, we can turn source [1 .. p]
                -- into target [1 .. q] in distance [p, q] operations
            until
                j > target.count
            loop
                if source [i] = target [j] then
                    distance [i, j] := distance [i - 1, j - 1]
                else
                    deletion := distance [i - 1, j]
                    insertion := distance [i, j - 1]
                    substitution := distance [i - 1, j - 1]
                    distance [i, j] := minimum (deletion, insertion, substitution) + 1
                end
                j := j + 1
            end
            i := i + 1
        end
        Result := distance [source.count, target.count]
    end

Notation: s [m .. n] is the substring of s with items at positions k such that m ≤ k ≤ n (empty if m > n).
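For readers who want to run the algorithm, here is a minimal sketch of the same table-filling scheme in Python (not the Eiffel of the slides). It assumes that "indexed from zero" means a table with rows 0 .. source.count and columns 0 .. target.count, and it uses Python's 0-based string indexing, so `source[i - 1]` plays the role of `source [i]` on the slide.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn source into target."""
    rows, cols = len(source), len(target)

    # distance[i][j]: cost of turning the first i characters of source
    # into the first j characters of target.
    distance = [[0] * (cols + 1) for _ in range(rows + 1)]

    # Turning a prefix into the empty string takes i deletions;
    # turning the empty string into a prefix takes j insertions.
    for i in range(rows + 1):
        distance[i][0] = i
    for j in range(cols + 1):
        distance[0][j] = j

    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if source[i - 1] == target[j - 1]:
                # Same character: keep it at no extra cost.
                distance[i][j] = distance[i - 1][j - 1]
            else:
                deletion = distance[i - 1][j]
                insertion = distance[i][j - 1]
                substitution = distance[i - 1][j - 1]
                distance[i][j] = min(deletion, insertion, substitution) + 1

    return distance[rows][cols]


print(levenshtein("MICHAEL JACKSON", "MENDELSSOHN"))  # 10
print(levenshtein("BEETH", "BEATLES"))                # 4
```

The sketch keeps the whole table, as the slides do; only the previous row is needed if the distance alone is wanted.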

6 Distance table for turning BEETH (rows) into BEATLES (columns). On the slide each cell also carries the operation that produced it: I Insert, K Keep, D Delete, S Substitute.

              B  E  A  T  L  E  S
           0  1  2  3  4  5  6  7
     0     0  1  2  3  4  5  6  7
  B  1     1  0  1  2  3  4  5  6
  E  2     2  1  0  1  2  3  4  5
  E  3     3  2  1  1  2  3  3  4
  T  4     4  3  2  2  1  2  3  4
  H  5     5  4  3  3  2  2  3  4

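7 Path through the table for BEETH (rows) to BEATLES (columns); cells on the path are shown in brackets.

           B    E    A    T    L    E    S
     [0]   1    2    3    4    5    6    7
 B    1   [0]   1    2    3    4    5    6
 E    2    1   [0]   1    2    3    4    5
 E    3    2    1   [1]   2    3    3    4
 T    4    3    2    2   [1]  [2]  [3]   4
 H    5    4    3    3    2    2    3   [4]

Steps:
1. Keep B
2. Keep E
3. Subst E → A
4. Keep T
5. Ins L
6. Ins E
7. Subst H → S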
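A path like the one above can be recovered by walking back from the bottom-right cell of the table. The sketch below is in Python (not the Eiffel of the slides). The function name `edit_script` and the tie-breaking order (diagonal first, then insertion, then deletion) are assumptions chosen for illustration; other orders give different scripts of the same length.

```python
def edit_script(source, target):
    """Return one minimal list of operations turning source into target."""
    rows, cols = len(source), len(target)

    # Fill the distance table exactly as in the algorithm on the slides.
    distance = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows + 1):
        distance[i][0] = i
    for j in range(cols + 1):
        distance[0][j] = j
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if source[i - 1] == target[j - 1]:
                distance[i][j] = distance[i - 1][j - 1]
            else:
                distance[i][j] = 1 + min(distance[i - 1][j],
                                         distance[i][j - 1],
                                         distance[i - 1][j - 1])

    # Walk back from the bottom-right corner to the top-left corner.
    ops = []
    i, j = rows, cols
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            ops.append("Keep " + source[i - 1])
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and distance[i][j] == distance[i - 1][j - 1] + 1:
            ops.append("Subst " + source[i - 1] + " -> " + target[j - 1])
            i, j = i - 1, j - 1
        elif j > 0 and distance[i][j] == distance[i][j - 1] + 1:
            ops.append("Ins " + target[j - 1])
            j -= 1
        else:
            # Only a deletion can explain this cell.
            ops.append("Del " + source[i - 1])
            i -= 1

    ops.reverse()  # steps were collected from last to first
    return ops


for step in edit_script("BEETH", "BEATLES"):
    print(step)
```

With this tie-breaking order the sketch prints Keep B, Keep E, Subst E -> A, Keep T, Ins L, Ins E, Subst H -> S, one step per line.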

