Published by Jennifer Quinn. Modified over 10 years ago.
1
CSC 448: Bioinformatics Algorithms. Alex Dekhtyar. Ukkonen’s Algorithm for Generalized Suffix Trees
2
Example for two DNA sequences: T and T’ = reverse(complement(T))
T = AATGTT
T’ = AACATT
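The relationship between the two example strings can be checked with a few lines of Python. This is a minimal sketch; the names `COMPLEMENT` and `reverse_complement` are illustrative and do not come from the slides.

```python
# Complement table for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Walk the sequence right to left and complement each base."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


t = "AATGTT"
t_prime = reverse_complement(t)
print(t_prime)  # AACATT
```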
3
Steps
1. Create SuffixTree(T$) using Ukkonen’s algorithm. Keep suffix links.
2. Add “T:” to all leaf labels (designate current labels).
3. Traverse SuffixTree(T$) using the prefix of T’. The stopping point is the new active point.
4. Use Ukkonen’s algorithm to insert the remainder of T’.
   4.1. Label leaves “T’: [x, ∞]”.
   4.2. Modification: traverse to existing leaves to leave a label.
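To see what the finished structure has to represent, it can help to build it the slow way first. The sketch below is not Ukkonen’s algorithm: it is a naive quadratic suffix trie that inserts every suffix of every string and records which string each suffix came from. The function names, the dictionary-based node layout, and the choice to label a node with the 1-based start position of the suffix (rather than the edge interval used on the slides) are all assumptions made for illustration.

```python
def build_generalized_suffix_trie(strings):
    """Naive construction: insert every suffix of every string.

    Each node is a dict with 'children' (char -> node) and 'labels',
    a list of (string_name, start_position) for suffixes ending there.
    """
    root = {"children": {}, "labels": []}
    for name, text in strings.items():
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node["children"].setdefault(
                    ch, {"children": {}, "labels": []}
                )
            # Record the owner of this suffix (1-based start position).
            node["labels"].append((name, start + 1))
    return root


def find(root, pattern):
    """Follow pattern from the root; return the node reached, or None."""
    node = root
    for ch in pattern:
        node = node["children"].get(ch)
        if node is None:
            return None
    return node


# T carries the terminator $, as on the slides; T' is inserted as given.
trie = build_generalized_suffix_trie({"T": "AATGTT$", "T'": "AACATT"})
print(find(trie, "TT$")["labels"])  # [('T', 5)]
print(find(trie, "TT")["labels"])   # [("T'", 5)]
```

Ukkonen’s algorithm produces the compressed form of this structure in linear time; the naive version is only useful for checking small examples by hand.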
4
T = AATGTT, T’ = AACATT
[Diagram: Tree and Trie panels, each showing only ε and ┴.]
5
T = AATGTT, T’ = AACATT
Step 1: insert first string
[Diagram: Tree and Trie panels, each with ε and ┴.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
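The trie on this slide has one node per distinct non-empty substring of T. A short sketch that enumerates them (the helper name `substrings` is illustrative, not from the slides):

```python
def substrings(text):
    """Return the set of distinct non-empty substrings of text."""
    return {
        text[i:j]
        for i in range(len(text))
        for j in range(i + 1, len(text) + 1)
    }


# Sort by length, then alphabetically, for easier comparison with the slide.
nodes = sorted(substrings("AATGTT"), key=lambda s: (len(s), s))
print(len(nodes))  # 18
print(nodes)
```

The count of 18 matches the number of node labels listed in the trie panel.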
6
T = AATGTT, T’ = AACATT
Step 1: insert first string
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: last boundary path; last active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
7
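T = AATGTT, T’ = AACATT
Step 1: insert first string
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: last boundary path; last active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
Tree edge labels: T, A, A, T, G, TG
Tree leaf labels: [2,∞], [3,∞], [4,∞], [6,∞]
Annotation: last active point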
8
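T = AATGTT$, T’ = AACATT
Step 1: insert first string
Step 1.5: finish the tree
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: last boundary path; last active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: [2,∞], [3,∞], [4,∞], [6,∞], [7,∞]
Annotation: last active point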
9
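T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: last boundary path; last active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: [2,∞], [3,∞], [4,∞], [6,∞], [7,∞]
Annotation: new active point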
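The traversal in this step stops at the longest prefix of T’ that already occurs in T$. The sketch below finds that stopping point with a plain substring test standing in for the walk down the suffix tree, so it illustrates the result of the step rather than the tree traversal itself. The function name `matched_prefix_length` is an assumption for illustration.

```python
def matched_prefix_length(text, other):
    """Length of the longest prefix of other that occurs somewhere in text."""
    length = 0
    while length < len(other) and other[: length + 1] in text:
        length += 1
    return length


t = "AATGTT$"
t_prime = "AACATT"
k = matched_prefix_length(t, t_prime)
print(t_prime[:k])  # AA    (already present in the tree for T$)
print(t_prime[k:])  # CATT  (remainder still to be inserted)
```

For the example strings the match ends after "AA", which is consistent with the next slides, where insertion resumes with the nodes AAC, AC and C.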
10
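T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: [2,∞], [3,∞], [4,∞], [6,∞], [7,∞]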
11
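T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: T:[2,∞], T:[3,∞], T:[4,∞], T:[6,∞], T:[7,∞]
Annotation: make leaf nodes “generalized”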
12
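T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: T:[2,∞], T:[3,∞], T:[4,∞], T:[6,∞], T:[7,∞]
New tree labels: T’:[3,∞]; C; T; C; C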
13
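T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point; end point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C, AACA, ACA, CA
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: T:[2,∞], T:[3,∞], T:[4,∞], T:[6,∞], T:[7,∞]
New tree labels: T’:[3,∞]; C; T; C; C
Annotation: nothing to do!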
14
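T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point; end point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C, AACA, ACA, CA, AACAT, ACAT, CAT
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: T:[2,∞], T:[3,∞], T:[4,∞], T:[6,∞], T:[7,∞]
New tree labels: T’:[3,∞]; C; T; C; C
Annotation: nothing to do!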
15
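T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point; end point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C, AACA, ACA, CA, AACAT, ACAT, CAT, ATT
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: T:[2,∞], T:[3,∞], T:[4,∞], T:[6,∞], T:[7,∞]
New tree labels: T’:[3,∞]; C; T; C; C; G; T’:[6,∞]; T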
16
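T = AATGTT$, T’ = AACATT
Step 2: Traverse the prefix of T’
Step 3: Start inserting the rest of T’
[Diagram: Tree and Trie panels, each with ε and ┴. Legend: active point; end point.]
Trie nodes: A, AA, AAT, AATG, AATGT, AATGTT, T, AT, ATG, TG, G, ATGT, TGT, GT, ATGTT, TGTT, GTT, TT
New trie nodes: AAC, AC, C, AACA, ACA, CA, AACAT, ACAT, CAT, ATT
Tree edge labels: T, A, A, T, G, T, G, $, $
Tree leaf labels: T:[2,∞], T:[3,∞], T:[4,∞], T:[6,∞], T:[7,∞]
New tree labels: T’:[3,∞]; C; T; C; C; G; T’:[6,∞]; T; T’:[6,∞]
Annotation: crucial bit coming!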