Comparative RNA Structural Analysis

1 Comparative RNA Structural Analysis

2 Overview
Comparative RNA Structural Analysis
Method 1: Align, then fold
Method 2: Fold, then compare

4 Comparative RNA Structural Analysis: Problem Definition
Input: A set of sequences with assumed structural similarities.
Output: An alignment and the common structural elements.

5-7 Possible approaches
[Diagram, built up over three slides: three routes from homologous RNA sequences (for example AUCCCCGUAUCGAUC and CUCGGCGUAUCGGUC) to aligned structures]
1. Sequence alignment gives aligned sequences; folding the alignment gives aligned structures.
2. Folding each sequence gives homologous RNA secondary structures; structure alignment gives aligned structures.
3. Sankoff: simultaneous fold and alignment goes from the sequences to aligned structures in one step.

8 Align, then fold
First step: multiple alignment
We want to use an algorithm we already know to fold our aligned sequences. How can we modify the Nussinov algorithm to fold multiple alignments?
[Figure: example multiple alignment; residues as extracted, row breaks not recoverable] A C G T G G A G A A C G G A C C C T A A A G G G G A T A T A G C A A T T A T C C G G A T T A G T T C C G G A T T G G A C G A A T A G G G C T A A A T G C C A

9 Align, then fold
We need a new scoring function
Scoring a base pair is different from scoring a pair of columns in our alignment. With the new scoring function, we can apply the Nussinov algorithm to the converted input (with slight changes).
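One way to picture the "slight changes" is sketched below in Python. This is a minimal sketch, not the slides' own method: it assumes the only change is that the Nussinov recurrence runs over alignment columns and adds a column-pair score instead of counting base pairs. The names `fold_alignment`, `pair_score` and `min_loop` are hypothetical.

```python
def fold_alignment(n_cols, pair_score, min_loop=3):
    """Nussinov-style DP over alignment columns 0..n_cols-1.

    pair_score(i, j) is an assumed column-pair score (for example a
    covariation measure). Returns the best total score and the paired columns.
    """
    if n_cols == 0:
        return 0.0, []
    best = [[0.0] * n_cols for _ in range(n_cols)]
    for span in range(min_loop + 1, n_cols):
        for i in range(n_cols - span):
            j = i + span
            value = max(best[i + 1][j], best[i][j - 1])            # leave i or j unpaired
            value = max(value, best[i + 1][j - 1] + pair_score(i, j))  # pair columns i and j
            for k in range(i + 1, j):                               # bifurcation
                value = max(value, best[i][k] + best[k + 1][j])
            best[i][j] = value

    # Traceback: re-evaluate the same cases to recover one optimal set of pairs.
    pairs, stack = [], [(0, n_cols - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
        elif best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
        elif best[i][j] == best[i + 1][j - 1] + pair_score(i, j):
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if best[i][j] == best[i][k] + best[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return best[0][n_cols - 1], sorted(pairs)


# Toy usage: three nested column pairs, each with an assumed score of 1.
stem = {(0, 9): 1.0, (1, 8): 1.0, (2, 7): 1.0}
score, pairs = fold_alignment(10, lambda i, j: stem.get((i, j), 0.0))
print(score, pairs)
```

Any column-pair score can be plugged in through `pair_score`; the DP itself does not depend on how that score is computed.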

10 Covariation
Columns that "change together" form a stem.
[Figure: example multiple alignment; residues as extracted, row breaks not recoverable] A C G U G G A G A A C G G A C C C U A A A G G G G A U A U A G C A A U U A U C C G G A U U A G U U C C G G A U U G G A C G A A U A G G G C U A A A U G C C A

11-13 The Mixy algorithm
For each column $i$ in the alignment, define $f_i(x)$, $x \in \{A, U, C, G\}$, to be the frequency of $x$ in column $i$.
For each pair of columns $i$ and $j$, define $f_{i,j}(x, y)$, $x, y \in \{A, U, C, G\}$, to be the frequency of sequences that have $x$ in column $i$ and $y$ in column $j$.
If $x$ and $y$ are independent, $\frac{f_{i,j}(x,y)}{f_i(x) \cdot f_j(y)} \approx 1$.
Example alignment (three sequences, columns 1 to 10):
A C G U G A A C G G
A C C C U G G G G G
A A U A G U U A U G
$f_2(C) = \frac{2}{3}$
$f_{2,9}(C,G) = \frac{2}{3}$
$f_{2,9}(A,G) = 0$

14 The Mixy algorithm
To measure the mutual information between columns $i$ and $j$, define:
$$H_{i,j} = \sum_{x,y} f_{i,j}(x,y) \log_2 \frac{f_{i,j}(x,y)}{f_i(x) \cdot f_j(y)}$$
Alignment:
A C G U G A A C G G
A C C C U G G G G G
A A U A G U U A U G
$$H_{2,9} = f_{2,9}(C,G) \log_2 \frac{f_{2,9}(C,G)}{f_2(C) \cdot f_9(G)} + f_{2,9}(A,U) \log_2 \frac{f_{2,9}(A,U)}{f_2(A) \cdot f_9(U)}$$
$$= \frac{2}{3} \log_2 \frac{2/3}{(2/3)(2/3)} + \frac{1}{3} \log_2 \frac{1/3}{(1/3)(1/3)} = \frac{2}{3} \log_2 \frac{3}{2} + \frac{1}{3} \log_2 3 \approx \frac{2}{3} \cdot 0.585 + \frac{1}{3} \cdot 1.585 \approx 0.918$$

15 The Mixy algorithm
Same alignment, columns 1 and 10 (both fully conserved):
$$H_{1,10} = f_{1,10}(A,G) \log_2 \frac{f_{1,10}(A,G)}{f_1(A) \cdot f_{10}(G)} = \frac{3}{3} \log_2 \frac{1}{1 \cdot 1} = 1 \cdot 0 = 0$$

16 The Mixy algorithm
Alignment with a fourth sequence added:
A C G U G A A C G G
A C C C U G G G G G
A A U A G U U A U G
A A A A G U C U U G
$$H_{3,7} = f_{3,7}(G,A) \log_2 \frac{f_{3,7}(G,A)}{f_3(G) \cdot f_7(A)} + f_{3,7}(C,G) \log_2 \frac{f_{3,7}(C,G)}{f_3(C) \cdot f_7(G)} + f_{3,7}(U,U) \log_2 \frac{f_{3,7}(U,U)}{f_3(U) \cdot f_7(U)} + f_{3,7}(A,C) \log_2 \frac{f_{3,7}(A,C)}{f_3(A) \cdot f_7(C)}$$
$$= 4 \cdot \frac{1}{4} \log_2 \frac{1/4}{(1/4)(1/4)} = 1 \cdot \log_2 4 = 2$$

17 The Mixy algorithm
With $H_{i,j} = \sum_{x,y} f_{i,j}(x,y) \log_2 \frac{f_{i,j}(x,y)}{f_i(x) \cdot f_j(y)}$ as the mutual information between columns $i$ and $j$:
$0 \le H_{i,j} \le 2$ (the upper bound is $\log_2 4 = 2$ for a four-letter alphabet)
A higher value means that columns $i$ and $j$ are correlated.
A lower value means that columns $i$ and $j$ are not correlated.
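The worked values for $H_{2,9}$, $H_{1,10}$ and $H_{3,7}$ can be checked with a short Python sketch. The function name `mutual_information` and the choice of 1-based column indices are illustrative assumptions; the sequences are the rows of the example alignment, and a gap character, if present, would simply be counted as another symbol.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """H_{i,j} for 1-based columns i and j of equal-length aligned sequences."""
    n = len(alignment)
    col_i = [seq[i - 1] for seq in alignment]
    col_j = [seq[j - 1] for seq in alignment]
    f_i = Counter(col_i)               # single-column counts
    f_j = Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))  # joint counts on the same sequence
    h = 0.0
    for (x, y), count in f_ij.items():
        joint = count / n
        h += joint * log2(joint / ((f_i[x] / n) * (f_j[y] / n)))
    return h


three = ["ACGUGAACGG", "ACCCUGGGGG", "AAUAGUUAUG"]
four = three + ["AAAAGUCUUG"]

print(round(mutual_information(three, 2, 9), 3))  # columns 2 and 9
print(mutual_information(three, 1, 10))           # two conserved columns
print(mutual_information(four, 3, 7))             # fully covarying columns
```

Pairs with zero joint frequency never appear in `f_ij`, so they contribute nothing to the sum, matching the convention that $0 \log 0 = 0$.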

18 Overview Comparative RNA Structural Analysis
Method 1: Align, then fold Method 2: Fold, then compare

19-20 Ordered rooted tree representation
Shapiro, 1988:
nodes: elements of secondary structure (hairpin loop, bulge, internal loop or multi-loop).
edges: base-paired (stem) regions.
Zhang, 1998:
nodes: unpaired bases (leaves) or paired bases (internal nodes). Each node is labeled with a base or a pair of bases.
edges: connect consecutive stem base pairs, or a leaf base with the last base pair in the corresponding stem.

21 Problem definition
The subtree isomorphism problem [Matula, 1968, 1978]: Given a pattern tree P and a text tree T, find a subtree of T that is isomorphic to P. In other words, determine whether some subtree of T is identical in structure to P.
The subtree homeomorphism problem [Chung, 1987, Reyner, 1977, Pinter et al., 2004]: Similar to the isomorphism problem, except that degree-2 nodes can be deleted from the text tree.

22 Subtree homeomorphism problem
Let $P$ and $T$ be two ordered, rooted trees. Let $t$ be a subtree of $T$, rooted at node $v \in T$.
A mapping $\alpha: P \to t$ is a one-to-one matching of nodes of $P$ to nodes of $t$. The mapping must preserve the ancestor relations of the nodes and their relative order.
The subtree homeomorphism score of a mapping, denoted $S(\alpha, v)$, combines four components [the formula itself is not recoverable from the transcript]:
the node-to-node similarity score function, for $u \in P$, $v \in t$
the edge-to-edge similarity score function, for $e_u \in P$, $e_v \in t$
the penalty for deleting a degree-2 node from $T$
the penalty for deleting any other node in $T$

23 Subtree homeomorphism problem
Given $P$ and $T$, we want to find a subtree $t$ of $T$ such that the score $S(\alpha, v)$ is maximal. How can we solve this problem efficiently? Dynamic programming!

24 Subtree homeomorphism problem
Isomorphism Homeomorphism

25 Rooted Ordered Subtree Isomorphism
Given trees $P$ and $T$ and a scoring table, compute the Labeled Ordered Rooted Subtree Isomorphism of $P$ and $T$.
No deletions are allowed from $P$.
Only deletions of complete subtrees from $T$ are allowed, with penalty = 0.
[Figure: tree $P$ (nodes a, b, c, d, e, f), tree $T$ (nodes a', b', c', d', e', f', g', h') and the scoring table; the layout is not recoverable from the transcript]

26 Rooted Ordered Subtree Isomorphism
[Figure: trees $P$ and $T$ with an empty DP table]
Rows are the nodes of $P$ in post-order: b, e, f, c, d, a
Columns are the nodes of $T$ in post-order: b', e', f', g', c', h', d', a'

27-33 Rooted Ordered Subtree Isomorphism
[Figure, animated over seven slides: the DP table is filled row by row in post-order]
$S(u, v)$ is defined by cases, one for when $u$ and $v$ are leaves and one otherwise [the case values are not recoverable from the transcript].
$\mathrm{height}(c) > \mathrm{height}(b')$, so the entry for $(c, b')$ is $-\infty$.

34-44 Rooted Ordered Subtree Isomorphism
[Figure, animated over eleven slides: computing $S(c, c')$ with a small DP table whose rows are the children of $c$ (e, f) and whose columns are the children of $c'$ (e', f', g'); the small table starts from a row of zeros, and its remaining cell values are not reliably recoverable from the transcript]
$S(c, c') = 3 + 5 = 8$

45-52 Rooted Ordered Subtree Isomorphism
[Figure, animated over eight slides: the DP table continues to be filled using the same case definition of $S(u, v)$]
A small DP table compares the children of $c$ (e, f) with the single child of $d'$ (h').
$\mathrm{depth}(c) > \mathrm{depth}(a')$
$\mathrm{height}(a) > \mathrm{height}(c')$

53-55 Rooted Ordered Subtree Isomorphism
[Figure, animated over three slides: computing $S(a, a')$ with a small DP table whose rows are the children of $a$ (b, c, d) and whose columns are the children of $a'$ (b', c', d')]
Scores $S(u, v)$ for the children, as read from the transcript:
      b'    c'    d'
b     4     4     3
c     −∞    8     −∞
d     3     3     4
$S(a, a') = 4 + 16 = 20$
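The 16 in the sum above is the best order-preserving matching of the children b, c, d to b', c', d'. The Python sketch below shows one way a small DP table could compute it. It assumes, in line with the stated rules, that every child on the $P$ side must be matched and that children on the $T$ side may be skipped at no cost. The function name `best_ordered_matching` is hypothetical, and the input scores are the child-pair values as they appear to read from the transcript.

```python
NEG_INF = float("-inf")


def best_ordered_matching(scores):
    """Best order-preserving matching of P-children (rows) to T-children (columns).

    scores[i][j] is S(i-th child of u, j-th child of v). Every row must be
    matched; columns may be skipped with penalty 0.
    """
    rows = len(scores)
    cols = len(scores[0]) if rows else 0
    # m[i][j]: best score matching the first i rows within the first j columns.
    # Row 0 is all zeros (nothing to match yet); other cells start at -inf.
    m = [[0.0] * (cols + 1)] + [[NEG_INF] * (cols + 1) for _ in range(rows)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            skip_column = m[i][j - 1]                           # delete a T subtree
            match = m[i - 1][j - 1] + scores[i - 1][j - 1]      # pair row i with column j
            m[i][j] = max(skip_column, match)
    return m[rows][cols]


children_scores = [
    [4, 4, 3],               # b against b', c', d'
    [NEG_INF, 8, NEG_INF],   # c against b', c', d'
    [3, 3, 4],               # d against b', c', d'
]
print(best_ordered_matching(children_scores))
```

Adding the node score for the pair $(a, a')$ to this result gives the entry in the large DP table.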

56-58 Rooted Ordered Subtree Isomorphism
Where is the solution?
[Figure, animated over three slides: the completed DP table, with 20 in row a, followed by the two small DP tables used for $S(c, c')$ and $S(a, a')$]

59 Running time complexity
[Figure: trees $P$ and $T$ and the completed DP table]
If $P$ has $m$ nodes and $T$ has $n$ nodes:
There are $nm$ cells in the large DP table.
In the worst case, for each cell we compute a small DP table with $mn$ cells.
This results in $O(n^2 m^2)$ running time.
Is there a tighter bound?

60 Running time complexity
[Figure: trees $P$ and $T$ and the completed DP table]
If $P$ has $m$ nodes and $T$ has $n$ nodes:
Each node $u$ in $P$ appears in a small DP table only when its parent is compared to a node in $T$.
The parent of a node in $P$ is compared at most $n$ times $\Rightarrow O(mn)$.
Symmetrically for a node $v$ in $T$.
Overall: $O(mn + mn + mn) = O(mn)$ (large DP, small DP for nodes in $P$, small DP for nodes in $T$).
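The same bound can be reached by counting small-DP cells directly. This is a sketch under two assumptions not stated on the slides: each small-DP cell costs constant time, and the small table for a pair $(u, v)$ has one cell per pair of children. Write $c(u)$ for the number of children of $u$ (notation introduced here). Every non-root node is the child of exactly one parent, so the child counts sum to $m - 1$ in $P$ and $n - 1$ in $T$:

$$\sum_{u \in P} \sum_{v \in T} c(u)\, c(v) = \Big( \sum_{u \in P} c(u) \Big) \Big( \sum_{v \in T} c(v) \Big) = (m - 1)(n - 1) = O(mn)$$

Adding the $mn$ cells of the large table leaves the total at $O(mn)$.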

