# Longest Common Subsequence (LCS) Dr. Nancy Warter-Perez.

## Outline

- Longest Common Subsequence (LCS)
  - Problem Definition
  - Scoring Algorithm
  - Printing Algorithm

## Longest Common Subsequence (LCS) Problem

Reference: Pevzner

An LCS alignment can have insertions and deletions but no substitutions (no mismatches).

Example:

- V: ATCTGAT
- W: TGCATA
- LCS: TCTA

## LCS Problem (cont.)

Similarity score:

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j} \\
s_{i,j-1} \\
s_{i-1,j-1} + 1, & \text{if } v_i = w_j
\end{cases}
$$
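The recurrence can be evaluated directly with a memoized recursion. The following is a minimal sketch, not the scoring algorithm from the slides: the function name `lcs_length` and the use of `functools.lru_cache` are assumptions made for illustration, and the base case (score 0 when either prefix is empty) corresponds to the zero-filled first row and column.

```python
from functools import lru_cache


def lcs_length(v: str, w: str) -> int:
    """Return the length of the longest common subsequence of v and w."""

    @lru_cache(maxsize=None)
    def s(i: int, j: int) -> int:
        # s(i, j) is the similarity score of the prefixes v[:i] and w[:j].
        if i == 0 or j == 0:
            return 0  # an empty prefix has nothing in common with anything
        best = max(s(i - 1, j), s(i, j - 1))
        # Python strings are 0-based, so v_i on the slide is v[i - 1] here.
        if v[i - 1] == w[j - 1]:
            best = max(best, s(i - 1, j - 1) + 1)
        return best

    return s(len(v), len(w))


print(lcs_length("ATCTGAT", "TGCATA"))  # 4
```

The recursion is only practical for short sequences, because Python limits recursion depth; the iterative table fill avoids that limit.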

## Example (1)

- V: ATCTGAT (Sequence 1), n = length of V = 7
- W: TGCATA (Sequence 2), m = length of W = 6

## Example (2): Initialization

- Create two matrices with n+1 rows and m+1 columns (superimposed into one matrix for this example):
  - Score matrix (s), shown in red on the original slides
  - Traceback matrix (b), shown in black on the original slides
- Fill in the first row and first column with zeros.

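## Example (3): Score

Score matrix after initialization (rows = V, columns = W):

|   |   | T | G | C | A | T | A |
|---|---|---|---|---|---|---|---|
|   | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| A | 0 |   |   |   |   |   |   |
| T | 0 |   |   |   |   |   |   |
| C | 0 |   |   |   |   |   |   |
| T | 0 |   |   |   |   |   |   |
| G | 0 |   |   |   |   |   |   |
| A | 0 |   |   |   |   |   |   |
| T | 0 |   |   |   |   |   |   |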

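## Example (4): Traceback

Each cell shows the score s followed by the traceback direction b (↖ = DIAG, ↑ = UP, ← = LEFT):

|   |   | T | G | C | A | T | A |
|---|---|---|---|---|---|---|---|
|   | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| A | 0 | 0 ↑ | 0 ↑ | 0 ↑ | 1 ↖ | 1 ← | 1 ↖ |
| T | 0 | 1 ↖ | 1 ← | 1 ← | 1 ↑ | 2 ↖ | 2 ← |
| C | 0 | 1 ↑ | 1 ↑ | 2 ↖ | 2 ← | 2 ↑ | 2 ↑ |
| T | 0 | 1 ↖ | 1 ↑ | 2 ↑ | 2 ↑ | 3 ↖ | 3 ← |
| G | 0 | 1 ↑ | 2 ↖ | 2 ↑ | 2 ↑ | 3 ↑ | 3 ↑ |
| A | 0 | 1 ↑ | 2 ↑ | 2 ↑ | 3 ↖ | 3 ↑ | 4 ↖ |
| T | 0 | 1 ↖ | 2 ↑ | 2 ↑ | 3 ↑ | 4 ↖ | 4 ↑ |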

## Indels

Indels: insertions and deletions (e.g., gaps) in the alignment of V and W.

- V = rows of the similarity matrix (vertical axis)
- W = columns of the similarity matrix (horizontal axis)
- Space (gap) in W → (UP) insertion
- Space (gap) in V → (LEFT) deletion
- Match (no mismatches in LCS) → (DIAG)
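To make the mapping between directions and gaps concrete, here is a small sketch that turns a list of moves into a gapped alignment. The function name `align_from_path`, the `-` gap character, and the hand-written `path` are assumptions for illustration. The path is one route through the example matrix for V = ATCTGAT and W = TGCATA, read from the top-left corner to the bottom-right corner, with a leading UP added so that the first letter of V is included.

```python
def align_from_path(v, w, path):
    """Build the two gapped rows of an alignment from a list of moves.

    DIAG consumes one letter of each sequence (a match),
    UP consumes a letter of v and puts a gap in w,
    LEFT consumes a letter of w and puts a gap in v.
    """
    v_row, w_row = [], []
    i = j = 0  # number of letters of v and w consumed so far
    for move in path:
        if move == "DIAG":
            v_row.append(v[i])
            w_row.append(w[j])
            i += 1
            j += 1
        elif move == "UP":
            v_row.append(v[i])
            w_row.append("-")
            i += 1
        elif move == "LEFT":
            v_row.append("-")
            w_row.append(w[j])
            j += 1
        else:
            raise ValueError(f"unknown move: {move}")
    return "".join(v_row), "".join(w_row)


# Hypothetical path for the running example (an assumption, written by hand).
path = ["UP", "DIAG", "LEFT", "DIAG", "LEFT", "DIAG", "UP", "DIAG", "UP"]
v_row, w_row = align_from_path("ATCTGAT", "TGCATA", path)
print(v_row)  # AT-C-TGAT
print(w_row)  # -TGCAT-A-
```

The columns where both rows show a letter spell out TCTA, the LCS of the example.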

## LCS(V,W) Algorithm

    for i = 0 to n
        s[i,0] = 0
    for j = 1 to m
        s[0,j] = 0
    for i = 1 to n
        for j = 1 to m
            if v[i] == w[j]
                s[i,j] = s[i-1,j-1] + 1; b[i,j] = DIAG
            else if s[i-1,j] >= s[i,j-1]
                s[i,j] = s[i-1,j]; b[i,j] = UP
            else
                s[i,j] = s[i,j-1]; b[i,j] = LEFT
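One possible Python rendering of this scoring pseudocode is sketched below. The function name `lcs_tables`, the list-of-lists representation, and the string labels `"DIAG"`, `"UP"`, and `"LEFT"` are choices made for this sketch rather than anything the slides prescribe.

```python
def lcs_tables(v, w):
    """Fill the score matrix s and traceback matrix b for sequences v and w."""
    n, m = len(v), len(w)
    # (n + 1) x (m + 1) matrices; row 0 and column 0 stay at their initial values.
    s = [[0] * (m + 1) for _ in range(n + 1)]
    b = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # The pseudocode is 1-based, so v_i is v[i - 1] in Python.
            if v[i - 1] == w[j - 1]:
                s[i][j] = s[i - 1][j - 1] + 1
                b[i][j] = "DIAG"
            elif s[i - 1][j] >= s[i][j - 1]:
                s[i][j] = s[i - 1][j]
                b[i][j] = "UP"
            else:
                s[i][j] = s[i][j - 1]
                b[i][j] = "LEFT"
    return s, b


s, b = lcs_tables("ATCTGAT", "TGCATA")
print(s[-1][-1])  # 4, the length of the LCS
for row in s:
    print(" ".join(str(x) for x in row))
```

The bottom-right entry of `s` holds the LCS length; the rows printed by the loop can be compared against the example matrix.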

## PRINT-LCS(b,V,i,j)

    if i = 0 or j = 0
        return
    if b[i,j] = DIAG
        PRINT-LCS(b, V, i-1, j-1)
        print v[i]
    else if b[i,j] = UP
        PRINT-LCS(b, V, i-1, j)
    else
        PRINT-LCS(b, V, i, j-1)
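The traceback can be sketched in Python as follows. Two things here are assumptions of the sketch: the helper `fill_traceback` (a compact version of the LCS(V,W) fill, included only so the example runs on its own) and the decision to have `print_lcs` return the subsequence as a string instead of printing one letter at a time, which makes the result easier to check.

```python
def fill_traceback(v, w):
    """Fill s and b as in the LCS(V,W) pseudocode and return only b."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    b = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if v[i - 1] == w[j - 1]:
                s[i][j], b[i][j] = s[i - 1][j - 1] + 1, "DIAG"
            elif s[i - 1][j] >= s[i][j - 1]:
                s[i][j], b[i][j] = s[i - 1][j], "UP"
            else:
                s[i][j], b[i][j] = s[i][j - 1], "LEFT"
    return b


def print_lcs(b, v, i, j):
    """Follow b back from cell (i, j) and return the LCS letters in order."""
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "DIAG":
        # A match: recurse first so earlier letters come out first.
        return print_lcs(b, v, i - 1, j - 1) + v[i - 1]
    if b[i][j] == "UP":
        return print_lcs(b, v, i - 1, j)
    return print_lcs(b, v, i, j - 1)


v, w = "ATCTGAT", "TGCATA"
b = fill_traceback(v, w)
print(print_lcs(b, v, len(v), len(w)))  # TCTA
```

Starting the call at `(len(v), len(w))` corresponds to starting the traceback in the bottom-right cell of the matrix.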

## Programming Workshop and Homework: Implement LCS

Workshop

- Write a Python script to implement LCS(V, W). Prompt the user for 2 sequences (V and W) and display b and s.
- Add the PRINT-LCS(b, V, i, j) function to your Python script. The script should prompt the user for 2 sequences and print the longest common subsequence.