1 Bioinformatics Techniques and Tools (Tècniques i Eines Bioinformàtiques)
23/04/2019
Bioinformatics: Sequence and Genome Analysis, David W. Mount
Flexible Pattern Matching in Strings (2002), Gonzalo Navarro and Mathieu Raffinot
Algorithms on Strings (2001), M. Crochemore, C. Hancart and T. Lecroq

2 Efficient search algorithms and data structures
String matching: definition of the problem (text, pattern).
Exact matching depends on what we have: the text or the patterns.
The patterns ---> data structures for the patterns
  1 pattern ---> the algorithm depends on |p| and |Σ|
  k patterns ---> the algorithm depends on k, |p| and |Σ|
The text ---> data structure for the text (suffix tree, ...)

3 Exact string matching: one pattern
How do string matching algorithms perform the search? For instance, given the sequence CTACTACTACGTCTATACTGATCGTAGCTACTACATGC, search for the pattern ACTGA, and then for the pattern TACTACGGTATGACTAA. As you have seen this morning ....

4 Exact string matching: Brute force algorithm
Example: given the pattern ATGTA, the search is
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
  A T G T A
    A T G T A
      A T G T A
        A T G T A
          A T G T A
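The brute-force search in this example can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function name `brute_force_search` is an assumption, and the text is the 22 characters visible on the slide with the trailing "..." dropped.

```python
def brute_force_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    matches = []
    # The window is shifted by exactly one cell after every attempt.
    for pos in range(n - m + 1):
        j = 0
        # Compare from left to right (prefix comparison).
        while j < m and text[pos + j] == pattern[j]:
            j += 1
        if j == m:
            matches.append(pos)
    return matches


# Text from the slide, without spaces (an assumption: the "..." is dropped).
text = "GTACTAGAGGACGTATGTACTG"
print(brute_force_search(text, "ATGTA"))  # [14]
```

Every one of the n - m + 1 windows is tried, which is what the later algorithms avoid by shifting further.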

5 Exact string matching: Brute force algorithm
How is the comparison made? From left to right: prefix search.
Which is the next position of the window? The window is shifted by only one cell.

6 Exact string matching: one pattern
How do matching algorithms perform the search? A sliding window moves along the text, and the pattern is compared against it. At each step the comparison is made and the window is shifted to the right.
Which facts differentiate the algorithms?
How the comparison is made.
The length of the shift.

7 Exact string matching: one pattern (text on-line)
Experimental efficiency (Navarro & Raffinot)
BNDM: Backward Nondeterministic Dawg Matching
BOM: Backward Oracle Matching
[Figure: regions where Horspool, BOM and BNDM are the most efficient, plotted by alphabet size |Σ| (ticks 2, 4, 8, 16, 32, 64) and pattern length; the label w also appears in the plot]

8 Horspool algorithm
How is the comparison made? Suffix search: the window is compared with the pattern from right to left.
Which is the next position of the window? Let a be the last character of the window: shift until the next occurrence of "a" in the pattern lines up with it.
We need a preprocessing phase to construct the shift table.

9-13 Horspool algorithm: example
Given the pattern ATGTA, the shift table is filled in one character at a time:
A 4
C 5
G 2
T 1
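The preprocessing phase can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the name `horspool_shift_table` and the `alphabet` parameter are illustrative and do not come from the slides. Each character gets the distance from its last occurrence among the first m - 1 pattern characters to the end of the pattern, and characters that do not occur there get the full length m.

```python
def horspool_shift_table(pattern, alphabet="ACGT"):
    """Build the Horspool shift table for pattern over the given alphabet."""
    m = len(pattern)
    # Characters absent from pattern[0 : m-1] allow a shift of the full length.
    shift = {c: m for c in alphabet}
    # The last pattern character is deliberately excluded from the loop.
    for j in range(m - 1):
        shift[pattern[j]] = m - 1 - j
    return shift


print(horspool_shift_table("ATGTA"))  # {'A': 4, 'C': 5, 'G': 2, 'T': 1}
```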

14-15 Horspool algorithm: example
Given the pattern ATGTA, the shift table is: A 4, C 5, G 2, T 1.
The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
  A T G T A
          A T G T A
              A T G T A
                        A T G T A
                            A T G T A
                                    A T G T A
The sixth window matches the pattern; the search then continues from the seventh window.
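The searching phase can be sketched in Python as follows. The function name `horspool_search` and the extra list of visited window positions are assumptions added for illustration; the text is the 22 visible characters of the slide, with the "..." dropped.

```python
def horspool_search(text, pattern):
    """Return (matches, windows): match positions and every window start visited."""
    n, m = len(text), len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # Shift table: distance from the last occurrence in pattern[0 : m-1] to the end.
    shift = {}
    for j in range(m - 1):
        shift[pattern[j]] = m - 1 - j
    matches, windows = [], []
    pos = 0
    while pos <= n - m:
        windows.append(pos)
        # Suffix search: compare the window with the pattern from right to left.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            matches.append(pos)
        # The shift depends only on the last character of the window.
        pos += shift.get(text[pos + m - 1], m)
    return matches, windows


matches, windows = horspool_search("GTACTAGAGGACGTATGTACTG", "ATGTA")
print(windows)  # [0, 1, 5, 7, 12, 14]
print(matches)  # [14]
```

On this truncated text the search stops after the matching window at position 14, because a shift of 4 leaves fewer than five characters. Six windows are inspected, against 18 for the brute-force search.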

16 Questions on the Horspool algorithm
Given the pattern ATGTA, the shift table is A 4, C 5, G 2, T 1. Given a random text over an equally likely probability distribution (EPD):
1.- Determine the expected shift of the window. What if the probability distribution is not equally likely?
2.- Determine the expected number of shifts, assuming a text of length n.
3.- Determine the expected number of comparisons in the suffix search phase.
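As a hedged starting point for question 1: the shift is chosen by the last character of the window, so if the text characters are independent, the expected shift is the probability-weighted average of the table entries. The sketch below assumes exactly that; the function name `expected_shift` and the skewed distribution are hypothetical and not from the slides.

```python
from fractions import Fraction


def expected_shift(shift, prob):
    """Expected window shift: sum over characters of P(c) * shift[c]."""
    return sum(prob[c] * shift[c] for c in shift)


shift = {"A": 4, "C": 5, "G": 2, "T": 1}

# Equally likely distribution: (4 + 5 + 2 + 1) / 4.
uniform = {c: Fraction(1, 4) for c in shift}
print(expected_shift(shift, uniform))  # 3

# Hypothetical non-uniform distribution, for illustration only.
skewed = {"A": Fraction(3, 10), "C": Fraction(1, 5),
          "G": Fraction(1, 5), "T": Fraction(3, 10)}
print(expected_shift(shift, skewed))  # 29/10
```

For question 2, a rough estimate under the same assumption is about n divided by the expected shift, ignoring boundary effects.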

17 Exact string matching: one pattern (text on-line)
Experimental efficiency (Navarro & Raffinot)
BNDM: Backward Nondeterministic Dawg Matching
BOM: Backward Oracle Matching
[Figure: the same efficiency map as on slide 7, with regions for Horspool, BOM and BNDM by alphabet size |Σ| and pattern length]

18 BNDM algorithm
How is the comparison made? Search for suffixes of the text window that are factors of the pattern P. The state of the search is kept in a bit vector D. Once the next character x is read:
D3 = (D2 << 1) & B(x)
where B(x) is the mask of x in the pattern P.
[Example: the bit values of B(x) and D shown on the slide are not recoverable]
Which is the next position of the window? It depends on the value of the leftmost bit of D.

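19 BNDM algorithm: example
Given the pattern ATGTA, the character masks are:
B(A) = (1 0 0 0 1)
B(C) = (0 0 0 0 0)
B(G) = (0 0 1 0 0)
B(T) = (0 1 0 1 0)
The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
A T G T A
          A T G T A
                    A T G T A
                            A T G T A
First window (GTACT), reading T, then C:
D1 = (0 1 0 1 0)
D2 = (1 0 1 0 0) & (0 0 0 0 0) = (0 0 0 0 0)
Second window (AGAGG), reading G, then G:
D1 = (0 0 1 0 0)
D2 = (0 1 0 0 0) & (0 0 1 0 0) = (0 0 0 0 0)
Third window (ACGTA), reading A, T, G, then C:
D1 = (1 0 0 0 1)
D2 = (0 0 0 1 0) & (0 1 0 1 0) = (0 0 0 1 0)
D3 = (0 0 1 0 0) & (0 0 1 0 0) = (0 0 1 0 0)
D4 = (0 1 0 0 0) & (0 0 0 0 0) = (0 0 0 0 0)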

20 BNDM algorithm: example
Given the pattern ATGTA, the character masks are:
B(A) = (1 0 0 0 1)
B(C) = (0 0 0 0 0)
B(G) = (0 0 1 0 0)
B(T) = (0 1 0 1 0)
The searching phase:
G T A C T A G A G G A C G T A T G T A C T G ...
                            A T G T A
Fourth window (ATGTA), reading A, T, G, T, then A:
D1 = (1 0 0 0 1)
D2 = (0 0 0 1 0) & (0 1 0 1 0) = (0 0 0 1 0)
D3 = (0 0 1 0 0) & (0 0 1 0 0) = (0 0 1 0 0)
D4 = (0 1 0 0 0) & (0 1 0 1 0) = (0 1 0 0 0)
D5 = (1 0 0 0 0) & (1 0 0 0 1) = (1 0 0 0 0)
D6 = (0 0 0 0 0) & (* * * * *) = (0 0 0 0 0)
Found!

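21 BNDM algorithm: example
Given the pattern ATGTA, the character masks are:
B(A) = (1 0 0 0 1)
B(C) = (0 0 0 0 0)
B(G) = (0 0 1 0 0)
B(T) = (0 1 0 1 0)
How is the shift determined?
The searching phase:
G T A C T A G A A T A C G T A T G T A C T G ...
A T G T A
          A T G T A
                A T G T A
First window (GTACT), reading T, then C:
D1 = (0 1 0 1 0)
D2 = (1 0 1 0 0) & (0 0 0 0 0) = (0 0 0 0 0)
Second window (AGAAT), reading T, A, then A:
D1 = (0 1 0 1 0)
D2 = (1 0 1 0 0) & (1 0 0 0 1) = (1 0 0 0 0)
D3 = (0 0 0 0 0) & (1 0 0 0 1) = (0 0 0 0 0)
In the second window the leftmost bit of D2 is set: the suffix AT of the window is a prefix of the pattern, so the window is shifted by 3 to align with it.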
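The BNDM search on ATGTA can be sketched in Python as follows. It is a minimal sketch under stated assumptions, not the slides' own code: the name `bndm_search` and the list of visited windows are illustrative, and Python integers stand in for the machine word (a real implementation needs the pattern length to fit in one word). The leftmost bit of each mask corresponds to the first pattern character.

```python
def bndm_search(text, pattern):
    """Return (matches, windows) for a bit-parallel BNDM search."""
    n, m = len(text), len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    full = (1 << m) - 1      # m ones
    high = 1 << (m - 1)      # leftmost bit
    # B[c]: mask of c in the pattern, leftmost bit = first pattern character.
    B = {}
    for j, c in enumerate(pattern):
        B[c] = B.get(c, 0) | (1 << (m - 1 - j))
    matches, windows = [], []
    pos = 0
    while pos <= n - m:
        windows.append(pos)
        j, last, D = m, m, full
        while D:
            # Read the window from right to left.
            D &= B.get(text[pos + j - 1], 0)
            j -= 1
            if D & high:
                if j > 0:
                    last = j             # a pattern prefix ends the window
                else:
                    matches.append(pos)  # the whole window was read
            D = (D << 1) & full
        pos += last
    return matches, windows


print(bndm_search("GTACTAGAGGACGTATGTACTG", "ATGTA"))
# ([14], [0, 5, 10, 14])
print(bndm_search("GTACTAGAATACGTATGTACTG", "ATGTA"))
# ([14], [0, 5, 8, 13, 14])
```

The shift is the value of `last`, which is updated whenever the leftmost bit of D is set before the whole window has been read.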

22 Exact string matching: one pattern (text on-line)
Most efficient algorithms (Navarro & Raffinot)
BNDM: Backward Nondeterministic Dawg Matching
BOM: Backward Oracle Matching
[Figure: the same efficiency map as on slide 7, with regions for Horspool, BOM and BNDM by alphabet size |Σ| and pattern length]

23 Factor Oracle automaton: properties
Factor Oracle of the word G T A T G T A
[Figure: the automaton, with extra transitions labelled G, A and T]
All states are final ==> it recognizes all the factors ... and more.
Hypothesis: it recognizes all the factors of GTA.
The state for the fourth letter, T, recognizes all the factors ending at that letter that were not yet recognized: GTAT, TAT and AT (T alone already was).
It then recognizes all the factors of the first 4 letters.

24 Factor Oracle automaton: algorithm
Algorithm:
for i = 1 to |p| do
  add the transitions that recognize the factors ending at i;
How?

25 Factor Oracle automaton: algorithm
What happens if the next character already exists? [Figure: transition labelled T]

26 Factor Oracle automaton: algorithm
What happens if the next character does not exist? [Figure: transitions labelled T]

27 Factor Oracle automaton: example of the algorithm
[Figure: the automaton, with extra transitions labelled G, A and T]
... and it recognizes words that are not factors, such as GTGTA. But if it does not recognize a word ==> that word is not a factor! This is the strategy of the BOM algorithm.
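The construction can be sketched in Python with the usual on-line method, which keeps a supply (failure) link for each state. This is an assumption about the method: the slides only outline the loop, and the names `build_factor_oracle`, `recognizes` and `supply` are illustrative.

```python
def build_factor_oracle(word):
    """Return the transition table of the factor oracle of word.

    trans[s] maps a character to the next state; states are 0..len(word)
    and all of them are final.
    """
    m = len(word)
    trans = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)          # supply link of each state
    for i in range(m):
        c = word[i]
        trans[i][c] = i + 1          # internal transition
        k = supply[i]
        # Add external transitions while the character is missing.
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans


def recognizes(trans, s):
    """True if the oracle can read s from the initial state."""
    state = 0
    for c in s:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True


oracle = build_factor_oracle("GTATGTA")
print(recognizes(oracle, "TATG"))   # True: a factor of GTATGTA
print(recognizes(oracle, "GTGTA"))  # True, although it is not a factor
print(recognizes(oracle, "GG"))     # False, so GG is not a factor
```

For GTATGTA this produces three external transitions, labelled T, A and G.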

28 BOM algorithm (Backward Oracle Matching)
How is the comparison made? [Figure: text and pattern windows]
Automaton: Factor Oracle. It checks whether the suffix of the window is a factor of the pattern.
How is the next position of the window determined? Two cases: the character a is not found in the automaton, or the final state of the automaton is reached with a.

29-34 Factor Oracle automaton: example of the algorithm
How is the comparison made? The automaton of the reversed pattern is built. Suppose the pattern is ATGTATG.
[Figure: the automaton, with extra transitions labelled G, A and T]
The search over the text, one window per slide:
G T A C T A G A A T G T G T A G A C A T G T A T G G T G ...
A T G T A T G
            A T G T A T G
                A T G T A T G
                      A T G T A T G
                                    A T G T A T G
                                      A T G T A T G
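The BOM search of this example can be sketched in Python as follows. It is a minimal sketch of the basic variant, under stated assumptions: after a full match the window moves by one cell, the name `bom_search` and the list of visited windows are illustrative, and the text is the 28 characters visible on the slides.

```python
def bom_search(text, pattern):
    """Return (matches, windows) for a basic Backward Oracle Matching search."""
    n, m = len(text), len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # Factor oracle of the reversed pattern (on-line construction).
    rev = pattern[::-1]
    trans = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)
    for i, c in enumerate(rev):
        trans[i][c] = i + 1
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    matches, windows = [], []
    pos = 0
    while pos <= n - m:
        windows.append(pos)
        state, j = 0, m
        # Read the window from right to left through the oracle.
        while j > 0 and state is not None:
            state = trans[state].get(text[pos + j - 1])
            j -= 1
        if state is not None:
            matches.append(pos)   # all m characters were read
            pos += 1
        else:
            pos += j + 1          # restart just after the failing character
    return matches, windows


print(bom_search("GTACTAGAATGTGTAGACATGTATGGTG", "ATGTATG"))
# ([18], [0, 6, 8, 11, 18, 19])
```

A failed transition proves that the suffix read so far is not a factor of the pattern, so no occurrence can overlap it and the window can jump past the failing character.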

