Presentation is loading. Please wait.

Presentation is loading. Please wait.

Qiong Cheng, Robert Harrison, Alexander Zelikovsky Computer Science in Georgia State University Oct. 17 2007 IEEE 7 th International Conference on BioInformatics.

Similar presentations


Presentation on theme: "Qiong Cheng, Robert Harrison, Alexander Zelikovsky Computer Science in Georgia State University Oct. 17 2007 IEEE 7 th International Conference on BioInformatics."— Presentation transcript:

1 Qiong Cheng, Robert Harrison, Alexander Zelikovsky Computer Science in Georgia State University Oct. 17 2007 IEEE 7 th International Conference on BioInformatics & BioEngineering Homomorphism Mapping in Metabolic Pathways

2 Outline Comparison of metabolic pathways Graph mappings: embeddings & homomorphisms Min cost homomorphism problem Enzyme similarity Optimal DP algorithm for trees Searching metabolic networks for pathway motifs pathway holes Future work

3 Metabolic pathway & pathways model Metabolic pathways model A portion of pentose phosphate pathway 1.1.1.49 1.1.1.34 2.7.1.13 3.1.1.31 1.1.1.44 Metabolic pathway

4 Comparison of metabolic pathways Enzyme Similarity Pathway topology Similarity Enzyme similarity and pathway topology together represent the similarity of pathway functionality. Mismatch/Substitute match

5 Related work Linear topology Tree topology DCBA D’ X’X’ A’A’ (Forst & Schulten[1999], Chen & Hofestaedt[2004];) D CB A X’X’ B’B’ A’A’ (Pinter [2005]  V G    V T  log  V G  V G  V T  log  V T  ) Arbitrary topology Mapping : Linear pattern  Graph (Kelly et al 2004) (  V T    V G    ) Exhaustively search (Sharan et al 2005 (  V T    V G    )  Yang et al 2007 (   VG   V G    )

6 Enzyme mapping cost EC (Enzyme Commission) notation Enzyme similarity score Δ Measure Δ by tight reaction property Enzyme X = x1. x2. x3. x4 Enzyme Y = y1. y2. y3. y4 = == = Δ[X, Y ] = 1 = == Δ[X, Y ] = 10 Δ[X, Y ] = +∞ Measure Δ by the lowest common upper class distribution Enzyme D = d1. d2. d3. d4 Δ[X, Y ] = log 2 c(X, Y ) = otherwise

7 Graph mappings: embeddings & homomorphisms Isomorphism T G f Homomorphism Isomorphic embedding Homeomorphic embedding Homomorphism f : T  G: f v : V T  V G f e : E T  paths of G Edge-to-path cost : (|f e (e)|-1) We allow different enzymes to be mapped to the same enzyme. Homomorphism cost  e in E T (|f e (e)|-1) + Δ(v, f v (v))  v in T =

8 Min cost homomorphism problem A multi-source tree is a directed graph, whose underlying undirected graph is a tree. ignoring direction Given a multisource tree T = (Pattern) and an arbitrary graph G = (Text), find min cost homomorphism f : T  G

9 Preprocessing of text graph Transitive closure of G is graph G*=(V, E*), where E*={(i,j): there is i-j-path in G} Text G AB C D E F Transitive Closure of G : G* AB C D E F 1 1 1 1 1 1 2 2 3 2 2 3 Transitive closure

10 Pattern graph ordering Pattern T ab c d Ordering c bd a Construct ordered pattern T ’ DFS traversal Processing order in opposite way Each edge e i in T ’ is the unique edge connecting v i with the previous vertices in the order Ordered pattern T ’

11 DP table u1u1 … ujuj … u |V G | a b c d Text arbitrary order DT[a, u j ] min cost homomorphism mapping from T’s subgraph induced by previous vertices in the order in to G* Pattern T ab c d

12 Filling DP table  is penalty for gaps Δ(v i, u j ) if v i is a leaf in T Δ (v i, u j ) + ∑ l=1 to adj(vi) Min j’=1 to |VG| C(i l, j’) if v i is a leaf in T = DT[i, j] i<|VT| j<|VG| Recursive function h(j, j l ) = #(hops between u j and u j l in G) C[i l, j l ] = DT[i l, j l ] + (h(j, j l ) - 1) vilvil vivi Pattern T G* u j’ ujuj h(j, j ’ )

13 Runtime Analysis Transitive closure takes O(|V G ||E G |). The total runtime is O(|V G ||E G |+|V G* ||V T |). Pattern graph ordering takes O(|V T | + |E T |) Dynamic programming - Calculate min contribution of all child pairs of node pair (v i ∈ T,u j ∈ G) takes t ij = deg T (v i )deg G* (u j ) - Filling DT takes  j=1 to |V G |  i=1 to |V T | t ij =  j=1 to |V G | deg G* (u j )  i=1 to |V T | deg T (v i ) = 2|E G* ||E T |

14 Statistical significance Randomized P-Value computation Random degree-conserved graph generation: Reshuffle nodes ab cd ab cd Reshuffle edges Reshuffle edge

15 Experiments & applications Identifying conserved pathways 24 pathways that are conserved across all 4 species 18 more pathways that are conserved across at least three of these species Resolving ambiguity Discovering pathways holes All-against-all mappings among S. cerevisiae, B. subtilis, T. thermophilus, and E.coli

16 Observation example 1 : Resolving Ambiguity 2.6.1.1 1.2.4.- 1.2.4.2 2.3.1.- 2.3.1.61 6.2.1.5 1.3.99.1 4.2.1.2 1. 1.1.1.82 Mapping of glutamate degradation VII pathways from B. subtilis to T. thermophilus (p < 0.01).

17 Future work Approximation algorithm to handle with the comparison of general graphs Mining protein interaction network A web-oriented tool will be developed Discovery of critical elements or modules based on graph comparison Discovery of evolution relation of organisms by pathway comparison of different organisms at different time points Integration with genome database


Download ppt "Qiong Cheng, Robert Harrison, Alexander Zelikovsky Computer Science in Georgia State University Oct. 17 2007 IEEE 7 th International Conference on BioInformatics."

Similar presentations


Ads by Google