Master Course MSc Bioinformatics for Health Sciences H15: Algorithms on strings and sequences Xavier Messeguer Peypoch (http://www.lsi.upc.es/~alggen)

1 Master Course MSc Bioinformatics for Health Sciences H15: Algorithms on strings and sequences Xavier Messeguer Peypoch (http://www.lsi.upc.es/~alggen) Dep. de Llenguatges i Sistemes Informàtics CEPBA-IBM Research Institute Universitat Politècnica de Catalunya

2 Contents 1. (Exact) String matching of one pattern 2. (Exact) String matching of many patterns 3. Extended string matching and regular expressions 4. Approximate string matching (Dynamic programming) 5. Pairwise and multiple alignment 6. Suffix trees

3 Master Course Second lecture: First part: Extended string matching

4 Extended string matching 1. Classes of characters in the text: there are characters in the text that represent sets of symbols. 2. Classes of characters in the pattern: there are classes of characters represented by one symbol. For instance, the IUPAC code for the DNA alphabet is: R = {G,A} Y = {T,C} K = {G,T} M = {A,C} S = {G,C} W = {A,T} B = {G,T,C} D = {G,A,T} H = {A,C,T} V = {G,C,A} N = {A,G,C,T} (any)
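
The IUPAC classes listed above can be written down as a lookup table. The following is only a minimal sketch: the names IUPAC, compatible and occurs_at are hypothetical, and treating the four plain bases as singleton classes is an assumption made so that every symbol is handled the same way.

```python
# IUPAC nucleotide classes as listed on the slide; the plain bases A, C, G, T
# are added as singleton classes (an assumption made for uniform lookup).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"G", "A"}, "Y": {"T", "C"}, "K": {"G", "T"}, "M": {"A", "C"},
    "S": {"G", "C"}, "W": {"A", "T"},
    "B": {"G", "T", "C"}, "D": {"G", "A", "T"},
    "H": {"A", "C", "T"}, "V": {"G", "C", "A"},
    "N": {"A", "G", "C", "T"},
}


def compatible(x, y):
    """Return True if symbols x and y can stand for a common base."""
    return bool(IUPAC[x] & IUPAC[y])


def occurs_at(pattern, text, pos):
    """Return True if pattern matches text at pos, symbol classes allowed on both sides."""
    if pos < 0 or pos + len(pattern) > len(text):
        return False
    return all(compatible(p, text[pos + i]) for i, p in enumerate(pattern))
```

With this table the same comparison covers both cases on the slide: classes in the text and classes in the pattern.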

5 Classes in the text [Figure: most efficient algorithms (Navarro & Raffinot) as a function of alphabet size |Σ| and pattern length; regions: Horspool, BNDM, BOM; w]

6 Classes in the text: Horspool example Given the pattern ATGTA, the shift table is: A 4 C 5 G 2 T 1 R ? … N ?

7 Classes in the text: Horspool example Suppose the pattern is ATGTA. The shift table would be: A 4 C 5 G 2 T 1 R 2 … N ?

8 Classes in the text: Horspool example Given the pattern ATGTA and the shift table: A 4 C 5 G 2 T 1 R 2 … N 1 Given the text: G T A R T R N A A G G A … A T G T A

9 Classes in the text: Horspool example Given the pattern ATGTA and the shift table: A 4 C 5 G 2 T 1 R 2 … N 1 Given the text: G T A R T R N A A G G A … A T G T A …
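
The shift values for R and N above are consistent with taking, for each class, the minimum shift over the bases it contains. The sketch below works under that assumption; CLASSES, horspool_shifts and horspool_search are hypothetical names, and the pattern is assumed to contain only plain bases.

```python
# Symbol classes: each text symbol maps to the bases it may stand for.
CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC", "S": "GC", "W": "AT",
    "B": "GTC", "D": "GAT", "H": "ACT", "V": "GCA", "N": "AGCT",
}


def horspool_shifts(pattern, classes=CLASSES):
    """Horspool shift table extended to classes: minimum shift over the members."""
    m = len(pattern)
    base = {}
    for i, c in enumerate(pattern[:-1]):
        base[c] = m - 1 - i          # later (rightmost) occurrences overwrite earlier ones
    return {sym: min(base.get(b, m) for b in members)
            for sym, members in classes.items()}


def horspool_search(pattern, text, classes=CLASSES):
    """Start positions where pattern matches text, with classes allowed in the text."""
    m, n = len(pattern), len(text)
    shifts = horspool_shifts(pattern, classes)
    hits, pos = [], 0
    while pos <= n - m:
        if all(pattern[i] in classes[text[pos + i]] for i in range(m)):
            hits.append(pos)
        pos += shifts[text[pos + m - 1]]   # shift by the last symbol of the window
    return hits
```

Taking the minimum keeps the search safe: a class in the text could be any of its bases, so the window may only advance as far as the most cautious of them allows.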

10 Classes in the text [Figure: most efficient algorithms (Navarro & Raffinot) as a function of alphabet size |Σ| and pattern length; regions: Horspool, BNDM, BOM; w] BNDM: Backward Nondeterministic Dawg Matching. BOM: Backward Oracle Matching.

11 Exact search algorithms for one pattern (on-line text) [Figure: most efficient algorithms (Navarro & Raffinot) as a function of alphabet size |Σ| and pattern length; regions: Horspool, BNDM, BOM; w] BNDM: Backward Nondeterministic Dawg Matching. BOM: Backward Oracle Matching.

12 Classes in the text: BOM How is the next position of the window determined? How is the comparison done? Text: Pattern: Automaton: Factor Oracle. It checks whether the suffix is a factor of the pattern. But first let us analyse how the comparison is done…

13 Classes in the text: BOM example The automaton of the reversed pattern is built. Suppose the pattern is ATGTATG and the search runs over the text: G T A R T R N A A T G… A T G T A T G How is the comparison done? [Figure: factor oracle automaton] No improvement is possible!

14 Exact search algorithms for many patterns [Figure: four panels (5, 10, 100 and 1000 words) showing the most efficient algorithm as a function of alphabet size |Σ| and minimum pattern length; regions: Wu-Manber, SBOM, Ad AC]

15 Classes in the text: Set Horspool Search for the patterns ATGTATG, TATG, ATAAT, ATGTG [Figure: trie built from the patterns] in the text: ARTGNCTATGTGACA… No improvement is possible!

16 Classes in the text [Figure: four panels (5, 10, 100 and 1000 words) showing the most efficient algorithm as a function of alphabet size |Σ| and minimum pattern length; regions: Wu-Manber, SBOM, Ad AC]

17 Classes in the pattern [Figure: most efficient algorithms (Navarro & Raffinot) as a function of alphabet size |Σ| and pattern length; regions: Horspool, BNDM, BOM; w]

21 Master Course Second lecture: Second part: Regular expressions matching

22 Regular expressions A regular expression ℛ is a string over Σ ∪ { ε, |, ·, *, (, ) } defined recursively as: ε is a regular expression; a character of Σ is a regular expression; ( ℛ ) is a regular expression; ℛ₁ · ℛ₂ is a regular expression; ℛ * is a regular expression; ℛ₁ | ℛ₂ is a regular expression.

23 Regular language The language represented by a regular expression is the set of words that can be built from the regular expression. The problem of searching for a regular expression in a text is that of finding all the factors of the text that belong to the corresponding regular language.
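
As an illustration of this search problem, the sketch below lists every factor of a text that belongs to the language of a regular expression. It is a brute-force check of all factors, not an efficient matching algorithm, and it assumes Python's re syntax, where concatenation (·) is written implicitly and ε is the empty string. The name matching_factors is hypothetical.

```python
import re


def matching_factors(regex, text):
    """Return (start, end, factor) for every non-empty factor text[start:end]
    that belongs to the language of regex."""
    compiled = re.compile(regex)
    found = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            # fullmatch with pos/endpos tests whether text[start:end] is in the language
            if compiled.fullmatch(text, start, end):
                found.append((start, end, text[start:end]))
    return found


if __name__ == "__main__":
    for hit in matching_factors("A(C|T)*G", "TACGATTG"):
        print(hit)
```

Unlike a usual left-to-right scan, this reports overlapping and nested factors as well, which is what "all the factors" asks for.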

24 Master Course Second lecture: Third part: Approximate string matching

25 Approximate string matching For instance, given the sequence CTACTACTACGTCTATACTGATCGTAGCTACTACATGC, search for the pattern ACTGA allowing one error… but what is the meaning of “one error”?

26 Edit distance We accept three types of errors: 1. Mismatch: ACCGTGAT → ACCGAGAT 2. Insertion: ACCGTGAT → ACCGATGAT 3. Deletion: ACCGTGAT → ACCGGAT (insertions and deletions are jointly called indels). The edit distance d between two strings is the minimum number of substitutions, insertions and deletions needed to transform the first string into the second one. d(ACT,ACT)= d(ACT,AC)= d(ACT,C)= d(ACT,)= d(AC,ATC)= d(ACTTG,ATCTG)=

27 Edit distance We accept three types of errors: 1. Mismatch: ACCGTGAT → ACCGAGAT 2. Insertion: ACCGTGAT → ACCGATGAT 3. Deletion: ACCGTGAT → ACCGGAT (insertions and deletions are jointly called indels). The edit distance d between two strings is the minimum number of substitutions, insertions and deletions needed to transform the first string into the second one. d(ACT,ACT)=0 d(ACT,AC)=1 d(ACT,C)=2 d(ACT,)= d(AC,ATC)= d(ACTTG,ATCTG)=

28 Edit distance We accept three types of errors: 1. Mismatch: ACCGTGAT → ACCGAGAT 2. Insertion: ACCGTGAT → ACCGATGAT 3. Deletion: ACCGTGAT → ACCGGAT (insertions and deletions are jointly called indels). The edit distance d between two strings is the minimum number of substitutions, insertions and deletions needed to transform the first string into the second one. d(ACT,ACT)=0 d(ACT,AC)=1 d(ACT,C)=2 d(ACT,)=3 d(AC,ATC)=1 d(ACTTG,ATCTG)=2

29 Edit distance and alignment of strings Given d(ACT,ACT)=0, d(ACT,AC)=1 and d(ACTTG,ATCTG)=2, which is the best alignment in every case? ACT and ACT: ACT / ACT. ACT and AC: ACT / AC-. ACTTG and ATCTG: ACTTG / ATCTG, or ACT-TG / A-TCTG. The edit distance is related to the best alignment of the strings.

30 Edit distance and alignment of strings But what is the distance between the strings ACGCTATGCTATACG and ACGGTAGTGACGC… and the best alignment between them? This problem was first discussed in 1966, and the algorithm was proposed in 1968, 1970, … using the technique called “dynamic programming”.
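
The link between alignment and edit distance can be made concrete by counting the columns of an alignment that are not matches. This is only a sketch: the name alignment_cost is hypothetical, and it assumes gaps are written as "-" with both rows of equal length.

```python
def alignment_cost(top, bottom):
    """Count mismatch and indel columns in an alignment given as two gapped rows."""
    if len(top) != len(bottom):
        raise ValueError("both rows of an alignment must have the same length")
    # A column costs 1 when its two symbols differ (mismatch, or a base against a gap).
    return sum(1 for a, b in zip(top, bottom) if a != b)


if __name__ == "__main__":
    print(alignment_cost("ACT", "AC-"))        # one indel column
    print(alignment_cost("ACTTG", "ATCTG"))    # two mismatch columns
    print(alignment_cost("ACT-TG", "A-TCTG"))  # two indel columns
```

Both alignments of ACTTG and ATCTG cost 2, which agrees with d(ACTTG,ATCTG)=2: the edit distance is the cost of a cheapest alignment, and that alignment need not be unique.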

43 Edit distance and alignment of strings [Dynamic programming matrix: columns labelled C T A C T A C T A C G T, rows labelled A C T G A]

45 Edit distance and alignment of strings [Dynamic programming matrix: columns labelled C T A C T A C T A C G T, rows labelled A C T G A] The cell contains the distance between AC and CTACT.

67 Edit distance and alignment of strings [Dynamic programming matrix: columns labelled C T A C T A C T A C G T, rows labelled A C T G A; cells still to be filled: ?]

68 Edit distance and alignment of strings [Same matrix. First row: 0. Remaining cells: ?]

69 Edit distance and alignment of strings [Same matrix. First row: 0 1, from the alignment - / C. Remaining cells: ?]

70 Edit distance and alignment of strings [Same matrix. First row: 0 1 2, from the alignment -- / CT. Remaining cells: ?]

71 Edit distance and alignment of strings [Same matrix. First row: 0 1 2 3 4 5 6 7 8 …, from the alignment ------ / CTACTA]

72 Edit distance and alignment of strings [Same matrix. First row: 0 1 2 3 4 5 6 7 8 … First column: A ? C ? T ?]

73 Edit distance and alignment of strings [Same matrix. First row: 0 1 2 3 4 5 6 7 8 … First column: A 1, C 2, T 3, G …, from the alignment ACT / ---]

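74 Edit distance and alignment of strings [Dynamic programming matrix: columns labelled C T A C T A C T A C G T, rows labelled A C T G A; first row 0 1 2 3 4 5 6 7 8 …; first column A 1, C 2, T 3] The best alignment BA(AC,CTAC) is the best of three candidates: BA(AC,CTA) followed by the column - / C; BA(A,CTA) followed by the column C / C; BA(A,CTAC) followed by the column C / -. Hence d(AC,CTAC) = min{ d(AC,CTA)+1, d(A,CTA), d(A,CTAC)+1 }.

75 Edit distance and alignment of strings [Dynamic programming matrix: columns labelled C T A C T A C T A C G T, rows labelled A C T G A; first row 0 1 2 3 4 5 6 7 8 …; first column A 1, C 2, T 3] d(AC,CTAC) = min{ d(A,CTAC)+1, d(A,CTA)+(0 if the last characters match, 1 otherwise), d(AC,CTA)+1 }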
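
The recurrence above fills the matrix cell by cell, starting from the first row and first column. A minimal sketch of that dynamic programming scheme follows; the names edit_matrix and edit_distance are hypothetical, and unit costs for mismatches and indels are assumed, as in the slides.

```python
def edit_matrix(s, t):
    """D[i][j] is the edit distance between the prefixes s[:i] and t[:j]."""
    rows, cols = len(s) + 1, len(t) + 1
    D = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        D[i][0] = i                      # s[:i] against the empty string: i deletions
    for j in range(cols):
        D[0][j] = j                      # empty string against t[:j]: j insertions
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            D[i][j] = min(D[i - 1][j - 1] + cost,   # match or mismatch
                          D[i - 1][j] + 1,          # indel: s[i-1] against a gap
                          D[i][j - 1] + 1)          # indel: t[j-1] against a gap
    return D


def edit_distance(s, t):
    """Edit distance between the whole strings s and t."""
    return edit_matrix(s, t)[-1][-1]


if __name__ == "__main__":
    D = edit_matrix("ACTGA", "CTACTACTACGT")
    print(D[2][4])   # the cell for d(AC,CTAC)
```

The cost is proportional to the product of the two lengths, since every cell is computed once from three neighbours.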

76 Edit distance and alignment of strings Connect to http://alggen.lsi.upc.es/docencia/ember/leed/Tfc1.htm and use the global method.

77 Edit distance and alignment of strings How can this algorithm be applied to approximate search, that is, to K-approximate string searching?

78 K-approximate string searching [Dynamic programming matrix: columns labelled with the text C T A C T A C T A C G T A C T G G T G A A …, rows labelled with the pattern A C T G A] This cell …

79 K-approximate string searching [Dynamic programming matrix: columns labelled with the text C T A C T A C T A C G T A C T G G T G A A …, rows labelled with the pattern A C T G A] This cell gives the distance between (ACTGA, CT…GTA)… but we are only interested in the last characters.
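
One standard way to make the matrix care only about the last characters of the text is to fill the first row with zeros, so that an occurrence may start anywhere, and then report every column whose last cell is at most k. The sketch below uses that variant as an assumption, since the slides do not spell it out; the name k_approx_ends is hypothetical.

```python
def k_approx_ends(pattern, text, k):
    """0-based text positions where an occurrence of pattern with at most k errors ends."""
    m = len(pattern)
    col = list(range(m + 1))      # column for the empty text prefix: d(pattern[:i], "") = i
    ends = []
    for j, t in enumerate(text):
        new = [0]                 # first row is 0: an occurrence may start at any position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == t else 1
            new.append(min(col[i - 1] + cost,   # match or mismatch
                           col[i] + 1,          # text character against a gap
                           new[i - 1] + 1))     # pattern character against a gap
        col = new
        if col[m] <= k:           # whole pattern matched some text suffix ending here
            ends.append(j)
    return ends


if __name__ == "__main__":
    text = "CTACTACTACGTCTATACTGATCGTAGCTACTACATGC"
    print(k_approx_ends("ACTGA", text, 1))
```

Only one column is kept at a time, so the memory needed depends on the pattern length and not on the text length.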

81 Master Course Second lecture: Fourth part: Pairwise and multiple alignment

82 Bioinformatics Pairwise and multiple alignment

83 Pairwise alignment Edit distance: match=0, mismatch=1, indel=1. d(AC,CTAC) = min{ d(A,CTAC)+1, d(A,CTA)+(0 on a match, 1 on a mismatch), d(AC,CTA)+1 }. Similarity: match=1, mismatch=-1, indel=-2. s(AC,CTAC) = max{ s(A,CTAC)-2, s(A,CTA)±1 (+1 on a match, -1 on a mismatch), s(AC,CTA)-2 }.

84 Pairwise alignment Similarity: match=1, mismatch=-1, indel=-2. [Similarity matrix: columns labelled C T A C T A C T A C G T, rows labelled A C T; first row 0 -2 -4 -6 …; first column A -2, C -4, T -6] s(AC,CTAC) = max{ s(A,CTAC)-2, s(A,CTA)±1 (+1 on a match, -1 on a mismatch), s(AC,CTA)-2 }.

85 Pairwise alignment Connect to http://alggen.lsi.upc.es and follow the links TEACHING, EMBER, LePA.
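
The similarity version of the matrix differs from the edit distance version only in its scores and in taking a maximum instead of a minimum. A minimal sketch, assuming the scores on the slide (match=1, mismatch=-1, indel=-2); the names similarity_matrix and similarity are hypothetical.

```python
def similarity_matrix(s, t, match=1, mismatch=-1, indel=-2):
    """S[i][j] is the best global alignment score of the prefixes s[:i] and t[:j]."""
    rows, cols = len(s) + 1, len(t) + 1
    S = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        S[i][0] = i * indel              # s[:i] aligned entirely against gaps
    for j in range(cols):
        S[0][j] = j * indel              # t[:j] aligned entirely against gaps
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + pair,   # match or mismatch
                          S[i - 1][j] + indel,      # s[i-1] against a gap
                          S[i][j - 1] + indel)      # t[j-1] against a gap
    return S


def similarity(s, t, **scores):
    """Best global alignment score of the whole strings s and t."""
    return similarity_matrix(s, t, **scores)[-1][-1]


if __name__ == "__main__":
    S = similarity_matrix("ACT", "CTACTACTACGT")
    print(S[0][:4])                   # first row starts 0, -2, -4, -6
    print([row[0] for row in S])      # first column is 0, -2, -4, -6
```

Because an indel costs twice as much as a mismatch here, the ungapped alignment of ACTTG and ATCTG scores better than the gapped one, although both have the same edit distance.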

86 Pairwise to multiple alignment [Figure: three strings S1, S2, S3] What happens with three strings? Let n be their length; then the cost becomes O(n^3) · O(2^3) · O(3^2). And with k strings? O(n^k 2^k k^2)

87 Multiple alignment Multiple alignment programs use different heuristics: Clustal (progressive alignment), http://www.ebi.ac.uk/clustalw; TCoffee (progressive alignment + databases), http://igs-server.cnrs-mrs.fr/Tcoffee_cgi/index.cgi; HMM (Hidden Markov Models).

88 Multiple alignment Connect to http://alggen.lsi.upc.es/ and follow the links TEACHING EMBER.

