Dynamic Networks: How Networks Change with Time? Vahid Mirjalili CSE 891.


Overview
- Introduction
- Methodology
  - DHAC: clustering in a single snapshot
  - MATCH-EM: cluster matching across time frames
- Results
- Discussion
- Further improvement

Motivation
- To infer the dynamic state of a cell in response to physiological changes
- Two algorithms are used:
  - DHAC: Dynamic Hierarchical Agglomerative Clustering, for clustering time-evolving networks
  - MATCH-EM: for matching corresponding clusters across time points

Background
- Current biological networks are static
- Experimental methods:
  - Protein abundance (mass spectrometry), mainly available for highly abundant proteins
  - Transcript abundance (more readily available)
- Previous work: combining transcript abundance and interaction networks to create a moving picture of the cell

Dynamic Networks
- Probabilistic framework
- The number of proteins can increase or decrease at each time point
- Proteins can switch interacting partners
- Complexes can grow or shrink
- Reveals temporal regulation of the cell's protein state

HAC: Hierarchical Agglomerative Clustering
- Agglomerative = “bottom up” approach
- Divisive = “top down” approach

HAC Features
- Maximizes the likelihood of a hierarchical stochastic block model
- Automatic selection of model size
- Multi-scale networks
- Outperforms other methods in link prediction
Extending HAC to dynamic networks:
- How do complexes inferred at one time point correspond to those at other time points?
- Transitions of a protein require dynamic coupling between network snapshots

DHAC
- Converts likelihood modularity from maximum likelihood to fully Bayesian statistics
- Kernelizes likelihood modularity with an adaptive bandwidth to couple network clusters at different time points

Dynamic Network Clustering
- Input: {G(t) = (V(t), E(t)), t = 1..T}
  - V: proteins
  - E: (undirected, unweighted) protein-protein interactions
- Goal: find the stochastic block models {M(t), t = 1..T}
  - M(t): network generative model for G(t)
- Introducing coupling between time points improves dynamic network clustering
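The input described on this slide can be pictured as a list of snapshots, one per time point. The following is a minimal sketch, not the authors' data format: the `Snapshot` class, the `make_snapshot` helper and the toy protein names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One network snapshot G(t) = (V(t), E(t)) (hypothetical container)."""
    vertices: frozenset  # proteins present at time t
    edges: frozenset     # each edge is frozenset({u, v}): undirected, unweighted


def make_snapshot(edge_list, extra_vertices=()):
    """Build a snapshot from (u, v) pairs; self-loops are dropped."""
    edges = frozenset(frozenset(e) for e in edge_list if e[0] != e[1])
    vertices = frozenset(v for e in edges for v in e) | frozenset(extra_vertices)
    return Snapshot(vertices, edges)


# Toy dynamic network with T = 3 snapshots (made-up proteins A..D).
dynamic_network = [
    make_snapshot([("A", "B"), ("B", "C")]),
    make_snapshot([("A", "B"), ("C", "D")]),
    make_snapshot([("C", "D")], extra_vertices=["A"]),
]
```

Storing each edge as a `frozenset` makes (u, v) and (v, u) the same edge, which matches the undirected, unweighted setting; the vertex set is allowed to change between snapshots.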

DHAC: Notation
- The probability of a structure model M
- The probability that a vertex is in cluster k

Merging Clusters
To merge clusters 1 and 2 into 1’:
- Maximum likelihood
- Bayesian
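The slide's formulas did not survive extraction. As a generic illustration of why the two scoring styles differ, the sketch below scores one block of a Bernoulli stochastic block model with `e` observed edges among `n` vertex pairs, once by maximum likelihood and once by a Bayesian marginal likelihood. The uniform Beta(1, 1) prior, the function names, and the restriction to the three blocks touching clusters 1 and 2 are assumptions made here, not taken from the slides.

```python
import math


def log_ml(e, n):
    """Maximized Bernoulli log-likelihood for e edges among n vertex pairs."""
    if n == 0 or e == 0 or e == n:
        return 0.0  # the fitted probability is 0 or 1, so the likelihood is 1
    p = e / n
    return e * math.log(p) + (n - e) * math.log(1 - p)


def log_bayes(e, n):
    """Log marginal likelihood under a uniform prior: log B(e + 1, n - e + 1)."""
    return math.lgamma(e + 1) + math.lgamma(n - e + 1) - math.lgamma(n + 2)


def merge_score(blocks, log_block):
    """Log score of the merged model minus the separate model.

    blocks holds (edges, pairs) for within-1, within-2 and between-1-2;
    edges to any other cluster are ignored in this simplified sketch.
    """
    e = sum(b[0] for b in blocks)
    n = sum(b[1] for b in blocks)
    return log_block(e, n) - sum(log_block(*b) for b in blocks)


# Two 3-vertex clusters that are fully connected internally and to each other.
dense = [(3, 3), (3, 3), (9, 9)]
# The same two clusters with no edges between them.
separate = [(3, 3), (3, 3), (0, 9)]
```

With maximum likelihood, a merge can never raise the score because the merged model is nested in the separate one, so some outside criterion is needed to decide when to merge. The Bayesian score rewards the merge of `dense` and penalizes the merge of `separate` on its own.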

Kernelization
- Kernel reweighting: to couple nearby snapshots
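One way to picture kernel reweighting is to let every snapshot contribute to the statistics used at time t, with a weight that decays with temporal distance. The sketch below assumes a normalized Gaussian kernel with a fixed `bandwidth`; the slides mention an adaptive bandwidth but do not give the kernel, so the functional form and both function names are hypothetical.

```python
import math


def kernel_weights(t, num_snapshots, bandwidth):
    """Normalized Gaussian weight of every snapshot s as seen from snapshot t."""
    raw = [math.exp(-0.5 * ((s - t) / bandwidth) ** 2) for s in range(num_snapshots)]
    total = sum(raw)
    return [w / total for w in raw]


def smoothed_edge_counts(snapshots, t, bandwidth):
    """Kernel-weighted presence of each edge, centered at snapshot t.

    snapshots is a list of edge sets, one per time point.
    """
    weights = kernel_weights(t, len(snapshots), bandwidth)
    counts = {}
    for w, edges in zip(weights, snapshots):
        for e in edges:
            counts[e] = counts.get(e, 0.0) + w
    return counts


# Toy example: edge A-B fades out while B-C fades in.
toy = [{("A", "B")}, {("A", "B"), ("B", "C")}, {("B", "C")}]
counts_at_1 = smoothed_edge_counts(toy, 1, bandwidth=1.0)
```

A small bandwidth treats each snapshot almost independently, while a large one approaches a single static network pooled over all time points.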

DHAC Algorithm
for t = 1:T do
    Set each vertex to be a single cluster
    Initialize the cumulative model comparison score
    Compute merging scores of pairs having an edge or a shared neighbor
    repeat
        Pick the pair i, j with the maximum merging score
        Update scores of affected pairs after merging i, j
        Merge i, j to i'
        Compute merging scores of i', j for all j having an edge to i' or a shared neighbor with i'
        Update the cumulative model comparison score
    until no pairs are left
    output the clustering at which the cumulative score was maximum
end for
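The greedy loop can be sketched for a single snapshot as follows. This is a simplified illustration rather than the DHAC implementation: it uses a Bayesian block score with a uniform prior (an assumption), recomputes the full model score for every candidate merge instead of updating only the affected pairs, and omits the kernel coupling between snapshots. All function names are hypothetical.

```python
import math
from itertools import combinations


def log_bayes(e, n):
    """Log marginal likelihood of e edges among n pairs (uniform prior, assumed)."""
    return math.lgamma(e + 1) + math.lgamma(n - e + 1) - math.lgamma(n + 2)


def labels_of(clusters):
    return {v: i for i, c in enumerate(clusters) for v in c}


def model_score(clusters, edges):
    """Sum of block scores over all cluster pairs, recomputed from scratch."""
    label = labels_of(clusters)
    e_count = {}
    for u, v in edges:
        key = tuple(sorted((label[u], label[v])))
        e_count[key] = e_count.get(key, 0) + 1
    score = 0.0
    for a in range(len(clusters)):
        for b in range(a, len(clusters)):
            na, nb = len(clusters[a]), len(clusters[b])
            n = na * (na - 1) // 2 if a == b else na * nb
            score += log_bayes(e_count.get((a, b), 0), n)
    return score


def candidate_pairs(clusters, edges):
    """Cluster pairs that are joined by an edge or share a neighboring vertex."""
    label = labels_of(clusters)
    nbrs = {v: set() for v in label}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    pairs = set()
    for v, ns in nbrs.items():
        for u in ns:  # direct edge between two clusters
            if label[u] != label[v]:
                pairs.add(tuple(sorted((label[u], label[v]))))
        for u, w in combinations(ns, 2):  # u and w share the neighbor v
            if label[u] != label[w]:
                pairs.add(tuple(sorted((label[u], label[w]))))
    return pairs


def greedy_hac(vertices, edges):
    """Merge the best-scoring pair until no candidates remain; keep the best model."""
    clusters = [frozenset([v]) for v in vertices]
    best, best_score = list(clusters), model_score(clusters, edges)
    while True:
        pairs = candidate_pairs(clusters, edges)
        if not pairs:
            break
        options = []
        for a, b in pairs:
            rest = [c for i, c in enumerate(clusters) if i not in (a, b)]
            merged = rest + [clusters[a] | clusters[b]]
            options.append((model_score(merged, edges), merged))
        score, clusters = max(options, key=lambda o: o[0])
        if score > best_score:
            best, best_score = list(clusters), score
    return best, best_score
```

For example, on two disjoint 4-cliques, `greedy_hac(range(8), edges)` returns the two cliques as clusters. Restricting candidates to pairs with an edge or a shared neighbor keeps the search local, which is what makes the incremental score updates in the pseudocode worthwhile.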

Cluster Matching Algorithm
- Searches through time frames to see how complexes evolve
- Goal: find the most probable matching of cluster i to a global index k

Results
- Drosophila development (gene expression data available)
- DHAC-local: variable bandwidth
- DHAC-const: constant bandwidth

Yeast Metabolic Cycle

Yeast Results
- Yeast results identify protein complexes with asynchronous gene expression
- 31 dynamic protein complexes were recovered
- Many of the complexes have cluster-specific Gene Ontology annotations with P-value < 0.05
- Some of the complexes disappear and then reappear across time points

Discussion
- DHAC scales as O(EJ ln(V))
- Networks with 2000 vertices take up to 5 min.
- A full genome network (10000 to vertices) can be analyzed in a day or a week
- This method permits proteins to switch between complexes over time
- A natural multi-scale view: complexes, sub-complexes and proteins

Further improvement
- Information from pathway to complex to sub-complex to finer structures could be used
- A method to match the dynamically evolving hierarchical structures across snapshots is lacking
- The authors focused only on the bottom-level complexes, rather than the hierarchical structure

MATCH-EM
- Goal: match similar groups across time points
- Find the mapping of each cluster to a global index
- There is one and only one global index for cluster i
- The probability that vertex u is in global index k
- The assignment matrix

- The matching probability under consistent indexing
- Number of shared vertices between cluster i at time t and cluster j at time t+1
- Probability that a vertex can make a transition from k to k’ between two consecutive snapshots
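The shared-vertex statistic mentioned here is simple to compute. The sketch below counts the overlaps between clusters in two consecutive snapshots and row-normalizes them into empirical transition frequencies. It illustrates only that ingredient, not the MATCH-EM updates, which are not recoverable from the slides; the function names and toy clusters are hypothetical.

```python
def shared_vertex_counts(clusters_t, clusters_next):
    """counts[i][j] = number of vertices shared by cluster i at t and cluster j at t+1."""
    return [[len(ci & cj) for cj in clusters_next] for ci in clusters_t]


def transition_frequencies(counts):
    """Row-normalize the overlap counts; rows with no shared vertices stay all zero."""
    freqs = []
    for row in counts:
        total = sum(row)
        freqs.append([c / total if total else 0.0 for c in row])
    return freqs


# Toy clusters at two consecutive time points (made-up vertex names).
clusters_t = [{"A", "B", "C"}, {"D", "E"}]
clusters_next = [{"A", "B"}, {"C", "D", "E"}]

counts = shared_vertex_counts(clusters_t, clusters_next)
freqs = transition_frequencies(counts)
```

In this toy case vertex C moves from the first cluster to the second, which shows up as an off-diagonal count of 1.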

Update:

Experimental Data
- Combining gene expression time series with static protein interaction networks
- The presence of a protein is assumed to be related to the abundance of the corresponding transcript at a nearby time
- N x T matrix: transcription levels of N genes across T time points
- The dynamics of the network are generated from the transcription matrix, under the assumption that proteins in a complex have correlated gene expression profiles
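A minimal sketch of how snapshots might be derived from these two inputs is shown below. It assumes the simplest possible rule: a protein is present at time t when its transcript level reaches a threshold, and a static interaction is kept when both partners are present. The slides do not specify the actual rule, so the threshold, the function name and the toy data are all hypothetical.

```python
def snapshots_from_expression(static_edges, expression, threshold):
    """Return one (present_proteins, active_edges) pair per time point.

    expression maps each gene to its list of T transcription levels (one row
    of the N x T matrix); static_edges is the static interaction network.
    """
    num_timepoints = len(next(iter(expression.values())))
    snapshots = []
    for t in range(num_timepoints):
        present = {g for g, levels in expression.items() if levels[t] >= threshold}
        edges = {(u, v) for u, v in static_edges if u in present and v in present}
        snapshots.append((present, edges))
    return snapshots


# Toy inputs: 3 genes, 3 time points, 2 static interactions.
static_edges = [("A", "B"), ("B", "C")]
expression = {
    "A": [1.0, 0.1, 0.9],
    "B": [0.8, 0.9, 0.2],
    "C": [0.1, 0.7, 0.9],
}
snaps = snapshots_from_expression(static_edges, expression, threshold=0.5)
```

Under this rule the vertex set and the edge set both change over time, which is the behavior the clustering step needs to handle.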

Results: Held-out link prediction
- Randomly select two connected vertices u and v, and remove the edge between them
- After clustering, vertex u is assigned to cluster i, and vertex v to cluster j
- The maximum likelihood probability that u and v were connected:
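The formula on this slide was lost. For a plain Bernoulli stochastic block model, the maximum likelihood estimate of the connection probability between clusters i and j is the fraction of possible vertex pairs between them that are observed as edges. The sketch below computes that estimate; treating it as the slide's formula is an assumption, and the function name and toy data are hypothetical.

```python
def link_probability(u, v, label, edges):
    """Fraction of possible pairs between the clusters of u and v that are edges."""
    i, j = label[u], label[v]
    size_i = sum(1 for x in label if label[x] == i)
    size_j = sum(1 for x in label if label[x] == j)
    if i == j:
        pairs = size_i * (size_i - 1) // 2  # pairs inside one cluster
    else:
        pairs = size_i * size_j             # pairs between two clusters
    observed = sum(1 for a, b in edges if {label[a], label[b]} == {i, j})
    return observed / pairs if pairs else 0.0


# Toy clustering after the edge A-C has been held out (made-up data).
label = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
remaining_edges = [("A", "B"), ("B", "C"), ("C", "D")]

p_within = link_probability("A", "C", label, remaining_edges)
p_between = link_probability("A", "D", label, remaining_edges)
```

A held-out edge inside a dense cluster gets a high score, so ranking candidate pairs by this probability is what the precision-recall and ROC evaluations measure.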

- AUPRC: area under the precision-recall curve
- AUROC: area under the receiver operating characteristic curve (true-positive rate against false-positive rate)
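Both metrics can be computed from predicted scores and true labels without any plotting. The sketch below uses the pairwise-comparison form of AUROC and average precision as a common estimator of AUPRC; the choice of estimator and the function names are assumptions, not taken from the slides, and tied scores are not treated specially in the precision computation.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank
    return total / hits


# Toy predictions: 1 marks a true held-out edge, 0 a non-edge.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
```

Here `auroc(scores, labels)` is 0.75 and `average_precision(scores, labels)` is 5/6. In link prediction, true edges are rare compared with non-edges, which is the usual reason to report a precision-recall summary alongside AUROC.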

Yeast Metabolic Cycle
Three dominant metabolic states:
1. Reductive Building (RB): 977 genes
2. Reductive Charging (RC): 1510 genes
3. Oxidative (OX): 1023 genes
- 36 snapshots
- Preprocessing: iterative degree cutoff, reducing the number of proteins from 1380 to 480±14
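An iterative degree cutoff can be sketched as repeatedly removing vertices whose degree falls below a threshold until none remain. The slides give neither the cutoff value nor the exact rule, so `min_degree`, the function name and the toy graph are hypothetical.

```python
def iterative_degree_cutoff(vertices, edges, min_degree):
    """Repeatedly drop vertices with degree below min_degree, then recount."""
    vertices = set(vertices)
    edges = {frozenset(e) for e in edges}
    while True:
        degree = {v: 0 for v in vertices}
        for e in edges:
            for v in e:
                degree[v] += 1
        low = {v for v, d in degree.items() if d < min_degree}
        if not low:
            return vertices, edges
        vertices -= low
        # Removing a vertex removes its edges, which can push neighbors below the cutoff.
        edges = {e for e in edges if not (e & low)}


# Toy graph: a triangle A-B-C with a chain C-D-E hanging off it.
toy_vertices = "ABCDE"
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
kept_vertices, kept_edges = iterative_degree_cutoff(toy_vertices, toy_edges, min_degree=2)
```

The iteration matters: D starts with degree 2 and only drops below the cutoff after E is removed, so a single pass would keep it.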

Macro-view of YMC (figure panels: RB phase, OX phase, RC phase)

Micro-views of YMC dynamics
Cluster #7: mitochondrial ribosome complex
1. RSMs: ribosomal small subunits of mitochondria
2. MRPs: mitochondrial ribosomal proteins
- RSM22 is active at t=9, 20 & 32, while the other proteins are not transcribed
- Methylation of the 3’-end of the rRNA of the small mitochondrial subunit is required for the assembly and stability of the mitochondrial ribosome
- Deleting RSM22 yields a viable cell with non-functional mitochondria
Hypothesis
- Early expression of RSM22 provides the methylation activity required for the assembly of the small subunits of the mitochondrial ribosome

Cluster #7: mitochondrial ribosomal complex
- Average expression levels during the three main phases

Micro-views of YMC dynamics
Cluster #16: nuclear pore
- Active at t=9, 20 & 32
- Most genes are OX-responsive
- Combines with subunits of other complexes
- The co-expressed cores:
  - Nuclear pore complex (NPC)
  - Karyopherin proteins (KAP)

Cluster #16: nuclear pore complex
- During the OX phase, SRP1 and SXM1 are additionally recruited

What did we learn from YMC?
- RRP4 and RRP42 are part of the exosome, which edits RNA molecules; they transition between the nuclear pore and other complexes
- RNA processing is tightly coupled to transport through the nuclear pore to the cytoplasm
- Dynamic reorganization of the nuclear pore occurs during the metabolic cycle