# CSC 448: Bioinformatics Algorithms. Alex Dekhtyar. Ukkonen’s Algorithm for Generalized Suffix Trees.


Example for two DNA sequences: T and T’ = reverse(complement(T)). T = AATGTT, T’ = AACATT.
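The relationship between T and T’ can be checked with a short sketch. The function name `reverse_complement` is illustrative and not taken from the slides, and the code assumes the plain A/C/G/T alphabet.

```python
# Complement table for DNA bases (assumes only A, C, G and T occur).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


T = "AATGTT"
T_prime = reverse_complement(T)
print(T_prime)  # AACATT
```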

Steps

1. Create SuffixTree(T\$) using Ukkonen’s algorithm. Keep the suffix links.
2. Add “T:” to all leaf labels (to mark the current labels as belonging to T).
3. Traverse SuffixTree(T\$) using the prefix of T’. The point where the traversal stops is the new active point.
4. Use Ukkonen’s algorithm to insert the remainder of T’.
   - 4.1. Label the new leaves “T’: [x, ∞]”.
   - 4.2. Modification: traverse to existing leaves to leave a label.
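To check the end result of these steps on small inputs, a naive reference structure can help. The sketch below is not Ukkonen’s algorithm: it builds an uncompressed generalized suffix trie in quadratic time and tags every node with the (string, start position) pairs that pass through it. The function names, the dictionary layout, and the second terminator `#` for T’ are assumptions made for illustration.

```python
def build_generalized_trie(strings):
    """Naive O(n^2) generalized suffix trie (a reference, not Ukkonen).

    strings maps a name such as "T" to a terminated string.
    Each node records which (name, 1-based start) suffixes pass through it.
    """
    root = {"children": {}, "labels": set()}
    for name, s in strings.items():
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node["children"].setdefault(
                    ch, {"children": {}, "labels": set()}
                )
                node["labels"].add((name, start + 1))
    return root


def occurrences(root, pattern):
    """Return the (name, start) pairs at which pattern occurs."""
    node = root
    for ch in pattern:
        node = node["children"].get(ch)
        if node is None:
            return set()
    return node["labels"]


# "#" as a terminator for T' is an assumption; the slides only show "$" for T.
trie = build_generalized_trie({"T": "AATGTT$", "T'": "AACATT#"})
print(sorted(occurrences(trie, "AT")))  # [('T', 2), ("T'", 4)]
```

A substring shared by both sequences shows up as a node whose labels mention both “T” and “T’”, which is the information the generalized leaf labels are meant to carry.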

T = AATGTT, T’ = AACATT. Tree: ε, ┴. Trie: ε, ┴.

T = AATGTT, T’ = AACATT. Step 1: insert first string. Tree: ε, ┴. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT.
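The trie states listed for Step 1 are exactly the distinct substrings of T, and they appear phase by phase as each character of T is read. A minimal sketch of that bookkeeping follows; it uses a plain set rather than an actual trie, and the function name is hypothetical.

```python
def new_substrings_per_phase(text):
    """For each prefix text[:i], list the suffixes of that prefix that
    were not already present after the previous phase."""
    seen = set()
    phases = []
    for i in range(1, len(text) + 1):
        prefix = text[:i]
        added = []
        for j in range(i):  # longest suffix first
            suffix = prefix[j:]
            if suffix not in seen:
                seen.add(suffix)
                added.append(suffix)
        phases.append(added)
    return phases


for i, added in enumerate(new_substrings_per_phase("AATGTT"), start=1):
    print(i, added)
# 1 ['A']
# 2 ['AA']
# 3 ['AAT', 'AT', 'T']
# 4 ['AATG', 'ATG', 'TG', 'G']
# 5 ['AATGT', 'ATGT', 'TGT', 'GT']
# 6 ['AATGTT', 'ATGTT', 'TGTT', 'GTT', 'TT']
```

In phases 5 and 6 the shortest suffix, T, is already present, so nothing is added for it.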

T = AATGTT, T’ = AACATT. Step 1: insert first string. Tree: ε, ┴. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT. Legend: last boundary path; last active point.

T = AATGTT, T’ = AACATT. Step 1: insert first string. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT. Tree: ε, ┴; labels T, A, [2,∞], A, [3,∞], [4,∞], T, G, [6,∞], TG. Legend: last boundary path; last active point. The last active point is also marked in the tree.

T = AATGTT\$, T’ = AACATT. Step 1: insert first string. Step 1.5: finish the tree. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT. Tree: ε, ┴; labels T, A, [2,∞], A, [3,∞], [4,∞], T, G, [6,∞], T, G, [7,∞], \$, \$. Legend: last boundary path; last active point. The last active point is also marked in the tree.

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT. Tree: ε, ┴; labels T, A, [2,∞], A, [3,∞], [4,∞], T, G, [6,∞], T, G, [7,∞], \$, \$. Legend: last boundary path; last active point. The new active point is marked in the tree.
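The effect of this traversal can be sketched without building the tree: the walk from the root follows T’ for as long as the characters read so far form a substring of T\$. The code below uses Python’s substring test as a stand-in for walking tree edges, which is an assumption made for brevity; the function name is hypothetical.

```python
def longest_matching_prefix(tree_text, new_text):
    """Longest prefix of new_text that occurs as a substring of tree_text.

    In a suffix tree for tree_text this is the path followed from the root
    before the traversal gets stuck.
    """
    k = 0
    while k < len(new_text) and new_text[: k + 1] in tree_text:
        k += 1
    return new_text[:k]


matched = longest_matching_prefix("AATGTT$", "AACATT")
print(matched)                  # AA
print("AACATT"[len(matched):])  # CATT
```

For this example the traversal stops after “AA”, so the remainder left for Ukkonen’s algorithm to insert is “CATT”.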

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C. Tree: ε, ┴; labels T, A, [2,∞], A, [3,∞], [4,∞], T, G, [6,∞], T, G, [7,∞], \$, \$. Legend: active point.

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C. Tree: ε, ┴; labels T, A, T:[2,∞], A, T:[3,∞], T:[4,∞], T, G, T:[6,∞], T, G, T:[7,∞], \$, \$. Legend: active point. Make leaf nodes “generalized”.
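One way to picture a “generalized” leaf label is as a record that names its source string and keeps an open right end. The sketch below is a hypothetical representation, not the data structure used in the slides: the class name `LeafLabel`, its fields, and the idea of passing the current end position to resolve ∞ are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class LeafLabel:
    source: str            # which string the leaf belongs to, e.g. "T" or "T'"
    start: int             # 1-based start position of the edge label
    end: float = math.inf  # open end: grows as more characters are read

    def text(self, strings: dict, current_end: int) -> str:
        """Resolve the label against its source string, treating the open
        end as the number of characters read so far."""
        stop = int(min(self.end, current_end))
        return strings[self.source][self.start - 1:stop]

    def __str__(self) -> str:
        end = "∞" if math.isinf(self.end) else str(self.end)
        return f"{self.source}:[{self.start},{end}]"


strings = {"T": "AATGTT$", "T'": "AACATT"}
leaf = LeafLabel("T", 2)
print(leaf)                    # T:[2,∞]
print(leaf.text(strings, 4))   # ATG
print(leaf.text(strings, 7))   # ATGTT$
```

Because the end is open, the same label reads as a longer string in each later phase without the leaf being touched.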

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C. Tree: ε, ┴; labels T, A, T:[2,∞], A, T:[3,∞], T:[4,∞], T, G, T:[6,∞], T, G, T:[7,∞], \$, \$; new: T’:[3,∞], C, T, C, C. Legend: active point.

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C, AACA, ACA, CA. Tree: ε, ┴; labels T, A, T:[2,∞], A, T:[3,∞], T:[4,∞], T, G, T:[6,∞], T, G, T:[7,∞], \$, \$; new: T’:[3,∞], C, T, C, C. Legend: active point; end point. Nothing to do!

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C, AACA, ACA, CA, AACAT, ACAT, CAT. Tree: ε, ┴; labels T, A, T:[2,∞], A, T:[3,∞], T:[4,∞], T, G, T:[6,∞], T, G, T:[7,∞], \$, \$; new: T’:[3,∞], C, T, C, C. Legend: active point; end point. Nothing to do!

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C, AACA, ACA, CA, AACAT, ACAT, CAT. Tree: ε, ┴; labels T, A, T:[2,∞], A, T:[3,∞], T:[4,∞], T, G, T:[6,∞], T, G, T:[7,∞], \$, \$; new: T’:[3,∞], C, T, C, C. Further labels: ATT, G, T’:[6,∞], T. Legend: active point; end point.

T = AATGTT\$, T’ = AACATT. Step 2: Traverse the prefix of T’. Step 3: Start inserting the rest of T’. Trie: ε, ┴, A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT; new: AAC, AC, C, AACA, ACA, CA, AACAT, ACAT, CAT. Tree: ε, ┴; labels T, A, T:[2,∞], A, T:[3,∞], T:[4,∞], T, G, T:[6,∞], T, G, T:[7,∞], \$, \$; new: T’:[3,∞], C, T, C, C. Further labels: ATT, G, T’:[6,∞], T, T’:[6,∞]. Legend: active point; end point. Crucial bit coming!
