Common Intervals in Sequences, Trees, and Graphs Steffen Heber and Jiangtian Li.



Genome Comparison of Bacteria [Kim et al., Nat. Biotechnol., 2004]

Gene Order & Function in Bacteria Gene order in bacteria is weakly conserved. [Gene order is not conserved in bacterial evolution. Mushegian, Koonin; Trends Genet. 1996] Some genes cluster together even in unrelated species. Genes inside a cluster are functionally associated. [Conserved clusters of functionally related genes in two bacterial genomes. Tamames et al.; J Mol Evol. 1997]


Formalization of Gene Clusters Genomes: permutations π_1, π_2, …, π_k. Genes: numbers 1, …, n. [Figure: example permutations π_1, π_2, π_3, π_4]

Intervals For a permutation π of [n] = {1, 2, …, n}, an interval (= gene cluster) is a set {π(i), π(i+1), …, π(j)} for 1 ≤ i < j ≤ n. Any permutation of [n] has n(n-1)/2 intervals.
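The definition above can be checked by brute force. The following is a minimal sketch, not taken from the slides: the function name `intervals` and the sample permutation are illustrative assumptions. It enumerates every set {π(i), …, π(j)} with i < j and confirms the count n(n-1)/2.

```python
def intervals(perm):
    """Return all intervals of a permutation as frozensets.

    An interval is the set of values at positions i..j with i < j,
    so every interval has at least two elements.
    """
    n = len(perm)
    return [frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)]


# Hypothetical example permutation of [5].
perm = [3, 1, 2, 5, 4]
n = len(perm)
ivs = intervals(perm)

# One interval per position pair i < j, hence n(n-1)/2 of them.
print(len(ivs), n * (n - 1) // 2)
```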

Common Intervals For a family F = (π_0, π_1, …, π_{k-1}) of permutations, a common interval of F (= conserved gene cluster) is a subset S ⊆ [n] such that S is an interval in all π_i. We write S ∈ C_F. [Figure: permutations π_0 and π_1]
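As an illustration of this definition, the sketch below computes C_F naively by intersecting the interval sets of all permutations. It is an assumption-laden toy (the names `intervals` and `common_intervals` and the sample family are hypothetical) and is far slower than the O(kn + K) algorithm the slides refer to.

```python
def intervals(perm):
    """All intervals (position ranges i..j, i < j) of a permutation, as a set of frozensets."""
    n = len(perm)
    return {frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)}


def common_intervals(perms):
    """Naive C_F: sets that are intervals in every permutation of the family."""
    common = intervals(perms[0])
    for p in perms[1:]:
        common &= intervals(p)
    return common


# Hypothetical family of two permutations of [5].
family = [[1, 2, 3, 4, 5], [3, 1, 2, 5, 4]]
result = common_intervals(family)
for s in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

For this family the common intervals are {1,2}, {4,5}, {1,2,3} and {1,2,3,4,5}.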


Lemma Let F = (π_0, π_1, …, π_{k-1}) and c, d ∈ C_F. If c ∩ d ≠ ∅ then c ∪ d ∈ C_F. We call c ∪ d reducible. [Figure: permutations π_0 and π_1 with a reducible interval and irreducible intervals marked]
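The lemma and the notion of reducibility can be tried out on a small family. The sketch below is an illustration under assumptions: it treats a common interval as reducible when it is the union of two intersecting common intervals that are both proper subsets of it, and all function names are hypothetical. It uses brute force, not the linear-time method.

```python
from itertools import combinations


def intervals(perm):
    """All intervals (position ranges i..j, i < j) of a permutation."""
    n = len(perm)
    return {frozenset(perm[i:j + 1]) for i in range(n) for j in range(i + 1, n)}


def common_intervals(perms):
    """Naive C_F: intersect the interval sets of all permutations."""
    common = intervals(perms[0])
    for p in perms[1:]:
        common &= intervals(p)
    return common


def union_closed(common):
    """Check the lemma: intersecting common intervals have a common-interval union."""
    return all((c | d) in common for c, d in combinations(common, 2) if c & d)


def irreducible(common):
    """Common intervals that are not a union of two intersecting proper sub-intervals."""
    result = set()
    for c in common:
        parts = [d for d in common if d < c]  # proper subsets of c
        if not any((a & b) and (a | b) == c for a, b in combinations(parts, 2)):
            result.add(c)
    return result


# Two identical permutations: every interval is common (the extreme case).
identity = [1, 2, 3, 4]
common = common_intervals([identity, identity])
irr = irreducible(common)
print(len(common), sorted(sorted(c) for c in irr))
```

With two identical permutations of [4], all 6 intervals are common, while only the 3 adjacent pairs are irreducible.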

Analysis We have K ≤ n(n-1)/2 common intervals, and I < n irreducible intervals. Find all K common intervals of k ≥ 2 permutations of [n]: O(kn + K) time & O(n) space.

Common Intervals of Trees Let T, T_1, …, T_k be trees with vertex set [n]. Definition: S ⊆ [n] is an interval of T iff T[S] is connected and |S| > 1. S ⊆ [n] is a common interval of T_1, …, T_k iff S is an interval in all trees. Tree intervals generalize intervals of permutations.
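To make the tree definition concrete, here is a brute-force sketch that tests every vertex subset of size at least 2 for connectivity. It is exponential in n and meant only for small examples. The helper names, the adjacency-dictionary representation, and the choice of a path 1-2-…-n and a star centered at vertex 1 are assumptions made for illustration.

```python
from itertools import combinations


def adjacency(n, edges):
    """Build an adjacency dictionary on vertices 1..n from an edge list."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_connected(adj, s):
    """True if the subgraph induced by the vertex set s is connected."""
    s = set(s)
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in s and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == s


def tree_intervals(adj):
    """All vertex sets S with |S| > 1 whose induced subgraph is connected."""
    verts = sorted(adj)
    return {
        frozenset(s)
        for r in range(2, len(verts) + 1)
        for s in combinations(verts, r)
        if is_connected(adj, s)
    }


def common_tree_intervals(trees):
    """Sets that are intervals in every tree of the family."""
    common = tree_intervals(trees[0])
    for t in trees[1:]:
        common &= tree_intervals(t)
    return common


n = 5
path = adjacency(n, [(i, i + 1) for i in range(1, n)])      # 1-2-3-4-5
star = adjacency(n, [(1, i) for i in range(2, n + 1)])      # center 1
common = common_tree_intervals([path, star])
print(sorted(sorted(s) for s in common))
```

With this labeling the common intervals are exactly the prefixes [2], [3], [4], [5], and the star alone has 2^{n-1} - 1 = 15 intervals.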

Miscellaneous Example: common intervals of T_1, T_2: { [2], [3], [4], [5] }. (Common) intervals in trees are induced subtrees. [Figure: trees T_1 and T_2]

Structure of Tree Intervals Tree intervals have the Helly property, i.e. for any family of tree intervals (T_i)_{i ∈ I}, the assumption T_p ∩ T_q ≠ ∅ for every p, q ∈ I implies ⋂_{i ∈ I} T_i ≠ ∅.

Extreme Cases n-vertex stars S_{n-1}: number of non-trivial induced subtrees is 2^{n-1} - 1.

The Common Interval Graph Given T = (T_1, …, T_k) and the corresponding common intervals C_T. The common interval graph G_T = (V, E) is the graph with V = C_T and E = {(c, d) | c, d ∈ C_T, c ∩ d ≠ ∅, c ≠ d}.
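The construction of G_T translates directly into code. This is a minimal sketch assuming the common intervals are already available as a set of frozensets; the function name and the sample input (the prefix sets {1, …, j} for j = 2..5) are illustrative assumptions.

```python
from itertools import combinations


def common_interval_graph(common):
    """Build G_T: vertices are common intervals, edges join distinct intersecting ones.

    Edges are stored as unordered pairs (frozensets of two intervals).
    """
    vertices = set(common)
    edges = {frozenset((c, d)) for c, d in combinations(vertices, 2) if c & d}
    return vertices, edges


# Hypothetical input: the prefixes [2], [3], [4], [5] of [5].
common = {frozenset(range(1, j + 1)) for j in range(2, 6)}
V, E = common_interval_graph(common)
print(len(V), len(E))
```

Because every pair of prefixes intersects, the resulting graph on 4 vertices has all 6 possible edges.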

Example V = [n], T = (P_n, S_{n-1}). We have C_T = { [2], [3], …, [n] } and G_T = K(C_T). [Figure: G_T on the vertices [2], [3], [4], …, [n]]

Common Interval Graphs cont’d A graph is called chordal if it does not contain an induced cycle C_n on n > 3 vertices. Proposition: Common interval graphs of trees are chordal graphs.

Irreducible Common Intervals For a common interval c ∈ C_T and a subset V ⊆ C_T we say that V generates c iff (i) for each d ∈ V, d ⊊ c; (ii) c = ⋃_{d ∈ V} d; (iii) G_T[V] is connected. If there is no such V then c is irreducible. The irreducible intervals generate all common intervals.

Finding Irreducible Intervals We have K < 2^{n-1} common intervals, and I < n irreducible intervals. Find all irreducible common intervals of k trees on n vertices: O(kn^2) time & O(kn) space.

Finding Irreducible Intervals Irreducible intervals are minimal common intervals containing an adjacent vertex pair. [Figure: example trees on the vertices x, y, l, z, m]

Graph Intervals G = (V, E), undirected, connected graph, V = [n]. S ⊆ V is an interval (convex) iff the induced subgraph G[S] is connected and includes every shortest path with end-vertices in S. [Figure: a convex vertex set and a non-convex one]
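Convexity in this sense can be tested with breadth-first search distances: a vertex w lies on some shortest u-v path exactly when d(u, w) + d(w, v) = d(u, v). The sketch below is an illustration only; the function names and the 4-cycle example are assumptions, and the graph is assumed to be connected and undirected as on the slide.

```python
from collections import deque


def bfs_dist(adj, src):
    """Unweighted shortest-path distances from src to every vertex."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_convex(adj, s):
    """True if every vertex on a shortest path between two vertices of s is in s."""
    s = set(s)
    dist = {u: bfs_dist(adj, u) for u in s}
    for u in s:
        for v in s:
            for w in adj:
                # w is on a shortest u-v path iff the distances add up exactly.
                if w not in s and dist[u][w] + dist[v][w] == dist[u][v]:
                    return False
    return True


# Hypothetical example: the 4-cycle 1-2-3-4-1.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(is_convex(cycle, {1, 2}), is_convex(cycle, {1, 2, 3}))
```

In the 4-cycle, {1, 2} is convex, but {1, 2, 3} is not, because vertex 4 lies on a shortest path between 1 and 3.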

Common Intervals of Graphs Let G = (G_1, …, G_k) be a family of connected undirected graphs with vertex set [n]. Definition: S ⊆ [n] is a common interval of G iff S is an interval in all graphs. Graph intervals generalize tree intervals. [Figure: graphs G_0 and G_1]

Some Differences The union of convex sets is NOT always convex.

Some Differences The common convex hull of an adjacent vertex pair is NOT always irreducible. [Figure: example graphs G_1 and G_2]

Finding Irreducible Graph Intervals Sketch: Given G = (G_0, G_1, …, G_{k-1})
For each edge (i, j) ∈ E_{i*} do
    S(i, j) := {i, j}
    For each (k, l) ∈ S(i, j)
        Add vertices ‘between’ k and l to S(i, j)
Remove reducible intervals
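The first part of this sketch can be illustrated as a closure computation. The code below rests on several assumptions not stated on the slide: ‘between’ is read as "lying on some shortest path in any graph of the family", the seed edges are taken from the first graph (the slide's E_{i*} is not specified further), and the final removal of reducible intervals is left out. All names are hypothetical.

```python
from collections import deque


def all_pairs_dist(adj):
    """Unweighted all-pairs distances via one BFS per vertex."""
    def bfs(src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist
    return {u: bfs(u) for u in adj}


def common_hull(dists, seed):
    """Smallest superset of seed that is convex in every graph of the family."""
    s = set(seed)
    changed = True
    while changed:
        changed = False
        for dist in dists:
            for u in list(s):
                for v in list(s):
                    for w in dist:
                        # Add w if it lies on a shortest u-v path in this graph.
                        if w not in s and dist[u][w] + dist[w][v] == dist[u][v]:
                            s.add(w)
                            changed = True
    return frozenset(s)


def edge_hulls(graphs):
    """Common convex hull S(i, j) for every edge (i, j) of the first graph (assumption)."""
    dists = [all_pairs_dist(adj) for adj in graphs]
    first = graphs[0]
    edges = [(u, v) for u in first for v in first[u] if u < v]
    return {e: common_hull(dists, e) for e in edges}


# Hypothetical family: a 4-cycle and a path on the same vertex set.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
hulls = edge_hulls([cycle, path])
for e in sorted(hulls):
    print(e, sorted(hulls[e]))
```

In this example the edge (1, 4) of the cycle grows to the whole vertex set, because vertices 2 and 3 lie between 1 and 4 in the path; the other edges are already convex in both graphs.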

Extreme Cases
Permutations (identical permutations): C ≤ n(n-1)/2, I < n
Trees (identical star-trees): C < 2^{n-1}, I < n
Graphs (complete graphs): C < 2^n, I ≤ n(n-1)/2

Example: InterDom Database of protein domain interactions. Sources: gene fusions, protein-protein interactions (DIP & BIND), protein complexes (PDB).

Comparing Two Networks

Comparing Three Networks G: Gene fusion, P: PDB, B: BIND, D: DIP

Irreducible Intervals [Figure: plot; axis label: size of irreducible interval]

Biologically Meaningful? RAS family domain, protein kinase, ankyrin repeat, PH domain, regulator of chromosome condensation

THANK YOU!!!