Qiong Cheng, Robert Harrison, Alexander Zelikovsky. Computer Science, Georgia State University. Oct. 17, 2007. IEEE 7th International Conference on BioInformatics.

Presentation transcript:

Homomorphism Mapping in Metabolic Pathways
Qiong Cheng, Robert Harrison, Alexander Zelikovsky. Computer Science, Georgia State University. Oct. 17, 2007. IEEE 7th International Conference on BioInformatics & BioEngineering.

Outline
Comparison of metabolic pathways
Graph mappings: embeddings & homomorphisms
Min cost homomorphism problem
Enzyme similarity
Optimal DP algorithm for trees
Searching metabolic networks for: pathway motifs; pathway holes
Future work

Metabolic pathway & pathways model
[Figure: a portion of the pentose phosphate pathway, shown as a metabolic pathway and as a metabolic pathways model.]

Comparison of metabolic pathways
Two kinds of similarity are compared: enzyme similarity (mismatch or substitute match) and pathway topology similarity.
Enzyme similarity and pathway topology together represent the similarity of pathway functionality.

Related work
Linear topology: Forst & Schulten [1999]; Chen & Hofestaedt [2004].
Tree topology: Pinter [2005], with a runtime bound in terms of $|V_G|$, $|V_T|$, $\log |V_G|$ and $\log |V_T|$.
Arbitrary topology: mapping a linear pattern to a graph (Kelley et al. 2004, bound in terms of $|V_T|$ and $|V_G|$); exhaustive search (Sharan et al. 2005, bound in terms of $|V_T|$ and $|V_G|$; Yang et al. 2007, bound in terms of $|V_G|$).

Enzyme mapping cost
EC (Enzyme Commission) notation: Enzyme X = x1.x2.x3.x4, Enzyme Y = y1.y2.y3.y4, Enzyme D = d1.d2.d3.d4.
Enzyme similarity score Δ can be measured in two ways:
- By tight reaction property: Δ[X, Y] = 1, Δ[X, Y] = 10, or Δ[X, Y] = +∞, depending on which EC digits of X and Y are equal.
- By the lowest common upper class distribution: Δ[X, Y] = $\log_2 c(X, Y)$.

Graph mappings: embeddings & homomorphisms
Mappings $f$ from $T$ to $G$: isomorphism, isomorphic embedding, homeomorphic embedding, homomorphism.
A homomorphism $f : T \to G$ consists of $f_v : V_T \to V_G$ and $f_e : E_T \to$ paths of $G$.
Edge-to-path cost: $|f_e(e)| - 1$.
We allow different enzymes to be mapped to the same enzyme.
Homomorphism cost $= \sum_{e \in E_T} (|f_e(e)| - 1) + \sum_{v \in V_T} \Delta(v, f_v(v))$
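The cost formula on this slide can be evaluated directly for a given mapping. The sketch below is illustrative only: the function name `homomorphism_cost`, the representation of a path as a list of text vertices, and the reading of $|f_e(e)|$ as the number of edges on the path are assumptions, not taken from the slides.

```python
def homomorphism_cost(pattern_edges, f_v, f_e, delta):
    """Sum of (path length - 1) over pattern edges plus delta over pattern vertices.

    pattern_edges: list of (a, b) pattern edges.
    f_v: dict mapping each pattern vertex to a text vertex.
    f_e: dict mapping each pattern edge to a path, given as a list of text vertices.
    delta: function (pattern_vertex, text_vertex) -> mapping cost.
    """
    edge_cost = 0
    for a, b in pattern_edges:
        path = f_e[(a, b)]
        # The path must connect the images of the edge's endpoints.
        if path[0] != f_v[a] or path[-1] != f_v[b]:
            raise ValueError("path does not connect the mapped endpoints")
        hops = len(path) - 1      # assumed meaning of |f_e(e)|
        edge_cost += hops - 1     # a direct edge costs nothing
    vertex_cost = sum(delta(v, u) for v, u in f_v.items())
    return edge_cost + vertex_cost


# Hypothetical example: one pattern edge mapped onto a two-hop path.
example_cost = homomorphism_cost(
    [("a", "b")],
    {"a": "A", "b": "C"},
    {("a", "b"): ["A", "B", "C"]},
    lambda v, u: 0 if v.upper() == u else 1,
)
```

In this example the edge is stretched over one intermediate vertex (cost 1) and `b` is mapped to a differently labelled vertex (cost 1).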

Min cost homomorphism problem
A multi-source tree is a directed graph whose underlying undirected graph (obtained by ignoring edge directions) is a tree.
Given a multi-source tree $T = (V_T, E_T)$ (the pattern) and an arbitrary graph $G = (V_G, E_G)$ (the text), find a min cost homomorphism $f : T \to G$.

Preprocessing of text graph
The transitive closure of $G$ is the graph $G^* = (V, E^*)$, where $E^* = \{(i, j) : \text{there is an } i\text{-}j \text{ path in } G\}$.
[Figure: a text graph G on vertices A, B, C, D, E, F and its transitive closure G*.]
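One way to compute the transitive closure is a breadth-first search from every vertex, which also yields hop counts between vertices. This is a minimal sketch under that assumption; the slides do not say which closure algorithm is used, and the name `transitive_closure_hops` is hypothetical.

```python
from collections import deque


def transitive_closure_hops(vertices, edges):
    """Return hops[i][j] = fewest edges on a directed i-j path in G.

    The pair (i, j) is in E* exactly when j is a key of hops[i].
    """
    succ = {v: [] for v in vertices}
    for i, j in edges:
        succ[i].append(j)
    hops = {}
    for s in vertices:
        dist = {}
        # Start from the successors of s so that only paths with at least one edge count.
        queue = deque((t, 1) for t in succ[s])
        while queue:
            v, d = queue.popleft()
            if v in dist:
                continue
            dist[v] = d
            for w in succ[v]:
                if w not in dist:
                    queue.append((w, d + 1))
        hops[s] = dist
    return hops


# Hypothetical text graph: A -> B -> C and A -> D.
closure = transitive_closure_hops("ABCD", [("A", "B"), ("B", "C"), ("A", "D")])
```

Each search visits every edge at most a constant number of times, so the whole computation stays within the $O(|V_G||E_G|)$ order of magnitude.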

Pattern graph ordering
Construct the ordered pattern T' from the pattern T by a DFS traversal; vertices are then processed in the opposite order.
Each edge e_i in T' is the unique edge connecting v_i with the previous vertices in the order.
[Figure: pattern T on vertices a, b, c, d, its ordering, and the ordered pattern T'.]
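The ordering step can be sketched as a DFS over the pattern with edge directions ignored. The function name `dfs_order`, the choice of root, and the example pattern are assumptions made for illustration.

```python
def dfs_order(vertices, edges, root):
    """DFS preorder of a multi-source tree, ignoring edge directions.

    Returns (order, parent). For every non-root vertex v, the edge between v
    and parent[v] is the unique edge joining v to the vertices before it.
    """
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)  # treat the tree as undirected
    order, parent = [], {root: None}
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    return order, parent


# Hypothetical pattern: a -> b, c -> b, b -> d (two sources, a and c).
order, parent = dfs_order("abcd", [("a", "b"), ("c", "b"), ("b", "d")], "a")
processing_order = list(reversed(order))  # leaves are handled first
```

Reversing the DFS order guarantees that every vertex is processed after all of its DFS children, which is what a bottom-up dynamic program needs.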

DP table
Rows: pattern vertices a, b, c, d. Columns: text vertices $u_1, \dots, u_j, \dots, u_{|V_G|}$ in arbitrary order.
$DT[a, u_j]$ is the min cost of a homomorphism from the subgraph of T induced by the previous vertices in the order into $G^*$.

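Filling DP table
Each gap incurs a penalty.
Recursive function, for $1 \le i \le |V_T|$ and $1 \le j \le |V_G|$:
$$DT[i, j] = \begin{cases} \Delta(v_i, u_j) & \text{if } v_i \text{ is a leaf in } T \\ \Delta(v_i, u_j) + \sum_{l=1}^{adj(v_i)} \min_{j'=1}^{|V_G|} C[i_l, j'] & \text{otherwise} \end{cases}$$
where $h(j, j_l)$ is the number of hops between $u_j$ and $u_{j_l}$ in $G$, and $C[i_l, j_l] = DT[i_l, j_l] + (h(j, j_l) - 1)$.
[Figure: pattern vertex $v_i$ with child $v_{i_l}$ in T, mapped to $u_j$ and $u_{j'}$ in $G^*$ at distance $h(j, j')$.]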
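The recurrence can be sketched as a bottom-up pass over the ordered pattern. This is an illustration under stated assumptions, not the authors' implementation: the function name and argument layout are hypothetical, the gap penalty is taken to be 1 per extra hop, and the direction of each pattern edge is assumed to decide in which direction the text path must run.

```python
import math


def min_cost_homomorphism(order, parent, directed_edges, text_vertices, hops, delta):
    """Fill DT bottom-up and return (best cost, DT).

    order, parent: DFS order and parent map of the pattern tree.
    directed_edges: set of pattern edges (a, b), meaning a -> b.
    hops[u][w]: fewest hops on a u -> w path in the text (key absent if no path).
    delta: function (pattern_vertex, text_vertex) -> mapping cost.
    """
    children = {v: [] for v in order}
    for v in order:
        if parent[v] is not None:
            children[parent[v]].append(v)
    DT = {}
    for v in reversed(order):  # leaves first
        DT[v] = {}
        for u in text_vertices:
            total = delta(v, u)  # a leaf has no children, so this is its value
            for c in children[v]:
                best = math.inf
                for w in text_vertices:
                    # Follow the direction of the pattern edge between v and c.
                    if (v, c) in directed_edges:
                        h = hops[u].get(w)
                    else:
                        h = hops[w].get(u)
                    if h is not None:
                        best = min(best, DT[c][w] + (h - 1))
                total += best
            DT[v][u] = total
    return min(DT[order[0]].values()), DT


# Hypothetical instance: pattern a -> b, text A -> X -> B.
text_hops = {"A": {"X": 1, "B": 2}, "X": {"B": 1}, "B": {}}
best_cost, table = min_cost_homomorphism(
    ["a", "b"], {"a": None, "b": "a"}, {("a", "b")},
    ["A", "X", "B"], text_hops,
    lambda v, u: 0 if v.upper() == u else 5,
)
```

Here the cheapest mapping sends `a` to `A` and `b` to `B` and pays 1 for the intermediate vertex `X`.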

Runtime analysis
- Transitive closure takes $O(|V_G||E_G|)$.
- Pattern graph ordering takes $O(|V_T| + |E_T|)$.
- Dynamic programming: calculating the min contribution of all child pairs of a node pair $(v_i \in T, u_j \in G)$ takes $t_{ij} = \deg_T(v_i)\,\deg_{G^*}(u_j)$, so filling DT takes $\sum_{j=1}^{|V_G|} \sum_{i=1}^{|V_T|} t_{ij} = \sum_{j=1}^{|V_G|} \deg_{G^*}(u_j) \sum_{i=1}^{|V_T|} \deg_T(v_i) = 2|E_{G^*}||E_T|$.
- The total runtime is $O(|V_G||E_G| + |V_{G^*}||V_T|)$.

Statistical significance
Randomized P-value computation.
Random degree-conserved graph generation: reshuffle nodes, or reshuffle edges.
[Figure: a graph on vertices a, b, c, d before and after reshuffling nodes and edges.]
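A common way to generate degree-conserved random graphs is repeated edge swapping: two edges (a, b) and (c, d) are replaced by (a, d) and (c, b). The sketch below assumes this is what "reshuffle edges" refers to; the slides do not give the procedure, and the name `reshuffle_edges` and its parameters are hypothetical.

```python
import random
from collections import Counter


def reshuffle_edges(edges, n_swaps, seed=0):
    """Degree-preserving randomization of a simple directed graph.

    Each accepted swap replaces (a, b), (c, d) with (a, d), (c, b), which keeps
    every vertex's in-degree and out-degree unchanged.
    """
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Reject swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


# Hypothetical input graph.
original = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
shuffled = reshuffle_edges(original, n_swaps=100, seed=1)
```

Mapping costs obtained on many such randomized graphs could then be compared with the observed cost to estimate a P-value.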

Experiments & applications
All-against-all mappings among S. cerevisiae, B. subtilis, T. thermophilus, and E. coli.
- Identifying conserved pathways: 24 pathways are conserved across all 4 species; 18 more pathways are conserved across at least three of these species.
- Resolving ambiguity.
- Discovering pathway holes.

Observation example 1: Resolving ambiguity
Mapping of glutamate degradation VII pathways from B. subtilis to T. thermophilus (p < 0.01).

Future work
- Approximation algorithm to handle the comparison of general graphs
- Mining protein interaction networks
- Development of a web-oriented tool
- Discovery of critical elements or modules based on graph comparison
- Discovery of evolutionary relations among organisms by comparing their pathways at different time points
- Integration with genome databases