Introduction To Bioinformatics Tutorial 2
Local Alignment Tutorial 2
Edit Distance
Usage: Spelling, …
Different types:
–Hamming
–Levenshtein
Algorithm:
–Naïve solution
–Dynamic programming
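The two distance types named above can be sketched in a few lines of Python. This is a minimal illustration, not code from the tutorial: the function names `hamming` and `levenshtein` are assumptions, and the Levenshtein version uses the dynamic programming approach with unit cost for every substitution, insertion and deletion.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    # prev[j] is the distance between the prefix of a handled so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or keep when x == y
            ))
        prev = curr
    return prev[-1]


print(hamming("TACTA", "TAATA"))       # 1
print(levenshtein("TACTAA", "TAATA"))  # 2
```

Hamming distance only compares position by position, so it cannot handle strings of different lengths; Levenshtein distance also allows insertions and deletions.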
Dynamic Programming
Richard Bellman (1950s)
“Program”
–Computer program?
–Optimal schedule
Conditions:
–Division into sub-problems is possible
–(Optimal) sub-problem solutions are reusable (many times?)
–“Bottom-up” approach
Examples:
–Shortest path
–Fibonacci
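The Fibonacci example shows both conditions at work. The sketch below is illustrative only (the names `fib_naive` and `fib_bottom_up` are assumptions): the naive recursion solves the same sub-problems again and again, while the bottom-up version solves each one once and stores it in a table.

```python
def fib_naive(n):
    """Plain recursion: the same sub-problems are recomputed many times."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


def fib_bottom_up(n):
    """Dynamic programming: fill a table from the smallest sub-problem upwards."""
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])  # reuse two stored results
    return table[n]


print(fib_naive(10))      # 55
print(fib_bottom_up(50))  # 12586269025
```

The naive version needs exponentially many calls, whereas the table is filled in a number of steps linear in n.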
Edit Distance
Usage: Spelling, Biology, …
Compare sequences:
–Similar sequence
–Ancestral origin
–Function
–…
Local Alignment
A dynamic programming algorithm for finding local matches between two sequences.
What is a local match?
–It is the best-matching, highest-scoring region between two sequences.
–It is a well-conserved region between two sequences.
Alignment
[Figure: table with N_1 … N_n along one axis and M_1 … M_m along the other.]
Cell [I,J] holds the best alignment of M_1..I with N_1..J.
All possible alignments are encoded as paths in the matrix.
Global vs Local
The differences:
1. We can start a new match instead of extending a previous alignment.
2. Instead of looking only at the far corner, we look anywhere in the table for the best score.
[Figure: two tables, Global and Local.]
Scoring System
–Match: +1
–Mismatch: -2
–Indel: -1
Local Alignment
Scoring System
–Match: +1 (N_i = M_j)
–Mismatch: -1 (N_i ≠ M_j)
–Indel: -2
[Figure: alignment table with N_1 … N_n along one axis and M_1 … M_m along the other; successive slides mark the cells that pair N_1 or M_1 with a gap.]
Local Alignment
Fill:
1. We fill the table as in global alignment, but we do not allow negative numbers (every negative number becomes 0).
2. No arrows come out of cells with a 0.
Scoring System
–Match: +1
–Mismatch: -1
–Indel: -2
[Figure: cell for M_2 and N_2. Diagonal arrow: +1 if M_2 = N_2, -1 if M_2 ≠ N_2. Horizontal and vertical arrows (indel): -2.]
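The fill rule can be sketched as follows, using the scoring system from the slide. This is an illustrative sketch rather than the tutorial's own code: the function name `fill_local` and the choice to put N along the columns and M along the rows are assumptions. Arrows are not stored here; only the scores are.

```python
MATCH, MISMATCH, INDEL = 1, -1, -2  # scoring system from the slide


def fill_local(n_seq, m_seq):
    """Fill the local alignment table: n_seq along columns, m_seq along rows."""
    rows, cols = len(m_seq) + 1, len(n_seq) + 1
    table = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if m_seq[i - 1] == n_seq[j - 1] else MISMATCH
            table[i][j] = max(
                0,                            # negative numbers become 0
                table[i - 1][j - 1] + pair,   # diagonal: match or mismatch
                table[i - 1][j] + INDEL,      # from above: indel
                table[i][j - 1] + INDEL,      # from the left: indel
            )
    return table


for row in fill_local("TACTAA", "TAATA"):
    print(row)
```

Taking the maximum with 0 is what lets a new match start at any cell instead of extending a poor previous alignment.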
Local Alignment
Trace: We trace back from the highest-scoring cells.
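A trace-back can be sketched like this. The sketch makes several assumptions that are not in the slides: the function name `local_alignments`, the layout (N along columns, M along rows), and the tie-breaking rule, which follows a single path from each highest-scoring cell and prefers the diagonal. Instead of storing arrows, it recomputes which neighbour produced each score.

```python
MATCH, MISMATCH, INDEL = 1, -1, -2  # scoring system from the slides


def local_alignments(n_seq, m_seq):
    """Return (best score, one alignment per highest-scoring cell)."""
    rows, cols = len(m_seq) + 1, len(n_seq) + 1
    t = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if m_seq[i - 1] == n_seq[j - 1] else MISMATCH
            t[i][j] = max(0, t[i - 1][j - 1] + pair,
                          t[i - 1][j] + INDEL, t[i][j - 1] + INDEL)

    best = max(max(row) for row in t)
    if best == 0:
        return 0, []  # no positive-scoring region at all

    found = []
    for start_i in range(1, rows):
        for start_j in range(1, cols):
            if t[start_i][start_j] != best:
                continue
            i, j, top, bottom = start_i, start_j, [], []
            while t[i][j] > 0:  # a 0 cell has no outgoing arrow: stop there
                pair = MATCH if m_seq[i - 1] == n_seq[j - 1] else MISMATCH
                if t[i][j] == t[i - 1][j - 1] + pair:   # diagonal arrow
                    top.append(n_seq[j - 1])
                    bottom.append(m_seq[i - 1])
                    i, j = i - 1, j - 1
                elif t[i][j] == t[i - 1][j] + INDEL:    # arrow from above
                    top.append("-")
                    bottom.append(m_seq[i - 1])
                    i -= 1
                else:                                   # arrow from the left
                    top.append(n_seq[j - 1])
                    bottom.append("-")
                    j -= 1
            found.append(("".join(reversed(top)), "".join(reversed(bottom))))
    return best, found


print(local_alignments("TACTAA", "TAATA"))
# (3, [('TAA', 'TAA'), ('TACTA', 'TAATA')])
```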
Local Alignment
Question: Will there be gaps at the start/end?
Example: local alignment of TACTAA (columns T1 A2 C3 T4 A5 A6) and TAATA (rows T1 A2 A3 T4 A5), with Match +1, Mismatch -1, Indel -2.
[Table, built up over several slides: the first row and the first column are set to 0. The cell for T1 and T1 has three candidates: a gap in either sequence (-2) or the match T/T (+1), so it receives 1. The remaining cells are filled the same way, with negative values replaced by 0.]
Leave only paths from the highest score.
Resulting local alignments:
TAA      TACTA
TAA      TAATA
And Now… Global Alignment
1. We keep negative numbers.
2. Arrows come out of any cell.
3. We trace back from the bottom-right to the top-left of the table.
Scoring System
–Match: +1
–Mismatch: -1
–Indel: -2
[Figure: cell for M_2 and N_2. Diagonal arrow: +1 if M_2 = N_2, -1 if M_2 ≠ N_2. Horizontal and vertical arrows (indel): -2.]
[Table: global alignment matrix for TACTAA (columns T1 A2 C3 T4 A5 A6) and TAATA (rows T1 A2 A3 T4 A5), filled with the scoring system above.]
Resulting global alignments:
TACTAA      TACTAA
TAATA-      TAAT-A
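The three global alignment rules can be sketched in the same style. As before, this is an illustration under assumptions, not the tutorial's code: the function name `global_alignment`, the layout (N along columns, M along rows), the first row and column initialised with accumulated indel penalties, and a trace-back that follows a single path and prefers the diagonal on ties.

```python
MATCH, MISMATCH, INDEL = 1, -1, -2  # scoring system from the slides


def global_alignment(n_seq, m_seq):
    """Return (score, aligned n_seq, aligned m_seq) for one optimal path."""
    rows, cols = len(m_seq) + 1, len(n_seq) + 1
    t = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        t[i][0] = i * INDEL  # leading gaps are penalised
    for j in range(1, cols):
        t[0][j] = j * INDEL
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if m_seq[i - 1] == n_seq[j - 1] else MISMATCH
            # Rule 1: negative numbers are kept (no max with 0).
            t[i][j] = max(t[i - 1][j - 1] + pair,
                          t[i - 1][j] + INDEL, t[i][j - 1] + INDEL)

    # Rule 3: trace back from the bottom-right cell to the top-left cell.
    i, j, top, bottom = rows - 1, cols - 1, [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = MATCH if m_seq[i - 1] == n_seq[j - 1] else MISMATCH
            if t[i][j] == t[i - 1][j - 1] + pair:  # diagonal arrow
                top.append(n_seq[j - 1])
                bottom.append(m_seq[i - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and t[i][j] == t[i - 1][j] + INDEL:  # arrow from above
            top.append("-")
            bottom.append(m_seq[i - 1])
            i -= 1
        else:                                         # arrow from the left
            top.append(n_seq[j - 1])
            bottom.append("-")
            j -= 1
    return t[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


print(global_alignment("TACTAA", "TAATA"))
# (1, 'TACTAA', 'TAAT-A')
```

With this tie-breaking the sketch reports only one of the two equally good alignments; enumerating all of them would require following every arrow at each tie.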