Detecting active subnetworks in molecular interaction networks with missing data Luke Hunter Texas A&M University SHURP 2007 Student.

Presentation transcript:

Detecting active subnetworks in molecular interaction networks with missing data Luke Hunter Texas A&M University SHURP 2007 Student

Outline of Talk
  Introduction
  Overall Strategy
  Previous Papers
  Graph Construction
  Scoring Function
  Search Approaches
  Experiments
  Future Work

Introduction
  Background: Ideker et al. define an ‘active subnetwork’ as a connected set of genes with unexpectedly high levels of differential expression.
  Objective: Find active subnetworks of metabolites.
  Motivation:
    High-throughput data analysis
    Mechanisms
    Cell state (disease, drug treatment, and environment)

Overall Strategy
  1) Build graph
  2) Obtain data (p-values)
  3) Create scoring function
  4) Find high-scoring subsets
  5) Validate results

Previous Papers (1): Ideker et al. (2002), “Discovering regulatory and signalling circuits in molecular interaction networks”
  Goal: find active subnetworks
  Graph
    Galactose utilization (~300 nodes, ~300 links)
    Protein-protein (P-P) and protein-DNA (P-DNA) interactions for yeast (~4000 nodes, ~7500 links)
    Data from perturbations of the GAL pathway
  Scoring
    Aggregate z-score and calibration (more later)
    Scoring over multiple conditions
  Searching
    Simulated annealing
  Results
    Do not contradict the literature
    Break up and organize the data

Previous Papers (2): Rajagopalan & Agarwal (2005), “Inferring pathways from gene lists using a literature-derived network of biological relationships”
  Goal: maximally include the query list in a minimal subset
  Graph
    Gathered data from 3 sources (~9000 nodes, ~30,000 links)
  Scoring
    Used aggregate z-score and calibration (from Ideker et al., 2002)
    Modified to consider node degree and node significance
  Searching
    Greedy algorithm with DFS
  Results
    Experiments are not convincing

Graph Construction
  KEGG data (Kanehisa et al.)
    Nodes: ligands (i.e., compounds, glycans, and drugs; ~25,000)
    Links: reactions (~29,000)
  Measured data
    Chronic ischemia (304 ligands)
    Glucose tolerance (124 ligands)
    Planned myocardial infarction (107 ligands)
  Problems with measured data
    Ambiguity
    Not in KEGG
    Duplicates
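The graph-construction step can be illustrated with a minimal sketch. It assumes reactions are available as pairs of ligand identifiers and that measured data is a mapping from ligand identifier to p-value; the function names and the identifiers in the usage note are hypothetical and are not taken from the slides or from KEGG. Ligands that appear in the graph but were not measured are kept with a missing value (`None`), and measured ligands that are absent from the graph are dropped, which is only one possible way to handle the "Not in KEGG" problem.

```python
from collections import defaultdict


def build_graph(reactions):
    """Build an undirected adjacency map from (ligand, ligand) reaction pairs."""
    adjacency = defaultdict(set)
    for a, b in reactions:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def attach_measurements(adjacency, measured):
    """Map every graph node to its measured p-value, or None if unmeasured.

    Assumption: measured ligands that are not in the graph are ignored.
    """
    return {node: measured.get(node) for node in adjacency}
```

For example, with the hypothetical reactions `[("L1", "L2"), ("L2", "L3")]` and measurements `{"L1": 0.01, "L9": 0.5}`, node `L1` keeps its p-value, `L2` and `L3` are marked as missing, and `L9` is discarded.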

Scoring Functions (1)
  Naïve
  Ideker et al. (2002)
  Whitlock (2005)
  Rajagopalan & Agarwal (2005)
    Use aggregate z-score of Ideker
    Create “corrected” node score
    Modify for node significance
    Modify for node degree
    Discrepancy with Ideker paper
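The formulas on this slide did not survive in the transcript, so the following is a sketch of the standard definitions rather than the presenter's exact code. It assumes the aggregate z-score is the sum of node z-scores divided by the square root of the subset size, that calibration subtracts the mean and divides by the standard deviation of aggregate scores from randomly sampled node sets of the same size, and that the weighted Z-method divides the weighted sum of z-scores by the square root of the sum of squared weights. Function names and default parameters are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, mean, pstdev

_STD_NORMAL = NormalDist()


def z_from_p(p):
    """Convert a p-value in (0, 1) to a z-score.

    Uses -Phi^-1(p), which equals Phi^-1(1 - p) but is more stable for small p.
    """
    return -_STD_NORMAL.inv_cdf(p)


def aggregate_z(z_scores):
    """Aggregate z-score: sum of z-scores divided by sqrt(k)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def weighted_z(z_scores, weights):
    """Weighted Z-method: sum(w * z) / sqrt(sum(w ** 2))."""
    numerator = sum(w * z for z, w in zip(z_scores, weights))
    return numerator / math.sqrt(sum(w * w for w in weights))


def calibrated_score(subset_z, all_z, n_samples=1000, seed=0):
    """Calibrate an aggregate score against random node sets of equal size.

    Assumption: the background is estimated by Monte Carlo sampling.
    """
    rng = random.Random(seed)
    k = len(subset_z)
    background = [aggregate_z(rng.sample(all_z, k)) for _ in range(n_samples)]
    return (aggregate_z(subset_z) - mean(background)) / pstdev(background)
```

With equal weights, `weighted_z` reduces to `aggregate_z`, which is one way to see how the two approaches relate.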

Scoring Functions (2)
  Significance vs. strength
  Geometric mean
  Piecewise function
  Weighted geometric mean
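The slide names a geometric mean and a weighted geometric mean but the transcript does not say which quantities are combined or how the piecewise function is defined. The sketch below therefore shows only the generic definitions for positive values; the function names are illustrative, and how they would be applied to significance and strength is left open.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean: exp(sum(w * ln x) / sum(w))."""
    total_weight = sum(weights)
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)
```

Working in log space avoids underflow when many small values such as p-values are multiplied together.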

Scoring Functions (3): Establish Significance of Scores
  1) Scramble
  2) Search
  3) Obtain distribution
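The scramble, search, and obtain-distribution steps amount to a permutation test. A minimal sketch follows, assuming node scores are held in a dictionary and that `search` is any caller-supplied function returning the best subnetwork score for a given score assignment; both function names and the add-one p-value estimate are assumptions, not details from the slides.

```python
import random


def null_distribution(node_scores, search, n_permutations=100, seed=0):
    """Scramble scores across nodes, rerun the search, and collect best scores."""
    rng = random.Random(seed)
    nodes = list(node_scores)
    values = [node_scores[n] for n in nodes]
    null = []
    for _ in range(n_permutations):
        rng.shuffle(values)                    # 1) scramble
        scrambled = dict(zip(nodes, values))
        null.append(search(scrambled))         # 2) search
    return null                                # 3) distribution of best scores


def empirical_p_value(observed, null):
    """Fraction of null scores at least as large as the observed score.

    Adds one to numerator and denominator so the estimate is never zero.
    """
    exceed = sum(1 for s in null if s >= observed)
    return (exceed + 1) / (len(null) + 1)
```

Because the graph topology is left unchanged and only the scores are shuffled, the null distribution reflects what the search finds when scores carry no network structure.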

Search Approaches (1): Simulated Annealing
  Ideker et al. (2002)
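The transcript gives no detail for this slide, so the following is a simplified sketch of simulated annealing over node subsets, not a reproduction of the method in Ideker et al. (2002). It assumes each node is either active or inactive, that a move toggles one random node, that a state is scored by its best connected component using the uncalibrated aggregate z-score, and that the temperature decays geometrically. All names and default parameters are hypothetical.

```python
import math
import random


def components(active, adjacency):
    """Connected components of the subgraph induced by the active nodes."""
    seen, comps = set(), []
    for start in active:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adjacency.get(node, ()):
                if nb in active and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        comps.append(comp)
    return comps


def best_component(active, adjacency, z):
    """Return the highest-scoring component and its aggregate z-score."""
    best, best_score = [], float("-inf")
    for comp in components(active, adjacency):
        score = sum(z[n] for n in comp) / math.sqrt(len(comp))
        if score > best_score:
            best, best_score = comp, score
    return best, best_score


def anneal(adjacency, z, n_iter=2000, t_start=1.0, t_end=0.01, seed=0):
    """Toggle random nodes, accepting worse states with probability exp(delta / T)."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    active = {n for n in nodes if rng.random() < 0.5}
    best_nodes, best_score = best_component(active, adjacency, z)
    current = best_score
    cooling = (t_end / t_start) ** (1.0 / n_iter)
    temp = t_start
    for _ in range(n_iter):
        node = rng.choice(nodes)
        active ^= {node}                       # toggle the node's state
        comp, candidate = best_component(active, adjacency, z)
        if candidate >= current or rng.random() < math.exp((candidate - current) / temp):
            current = candidate
            if candidate > best_score:
                best_nodes, best_score = comp, candidate
        else:
            active ^= {node}                   # reject: undo the toggle
        temp *= cooling
    return sorted(best_nodes), best_score
```

Accepting some score-decreasing moves early on lets the search escape local optima, while the falling temperature makes it behave greedily toward the end.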

Search Approaches (2): Greedy Algorithm w/ DFS
  1) Build graph and calculate corrected node scores
  2) Use BFS to group nodes with positive corrected scores
  3) For each connected component, do a limited DFS and try to merge with nearby connected components if the merge would increase the overall score
  4) Prune nodes with small z-scores (so long as connectivity is maintained)
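Step 2 of the greedy algorithm can be sketched as a breadth-first grouping of positively scoring nodes. The sketch assumes the graph is an adjacency map and that corrected node scores are already computed; the function name is hypothetical, and the merging and pruning of steps 3 and 4 are not covered.

```python
from collections import deque


def positive_components(adjacency, corrected):
    """Group nodes with positive corrected scores into connected components (BFS)."""
    positive = {n for n, s in corrected.items() if s > 0}
    seen, groups = set(), []
    for start in sorted(positive):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            node = queue.popleft()
            group.append(node)
            # Only expand through neighbours that also score positively.
            for nb in adjacency.get(node, ()):
                if nb in positive and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        groups.append(sorted(group))
    return groups
```

Each returned group is a seed that the later merge step could try to join with nearby groups through non-positive nodes.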

Algorithm Test

Future Goals
  Remove “distant” unknown nodes?
  Evaluate scoring functions
  Evaluate search strategies
  Implement with Google MapReduce
  Apply to more data sets
  Use Cytoscape software

Acknowledgements
  NSF REU Program
  Fritz
  Gabriel
  Everyone else

References
  Ideker, T., Ozier, O., Schwikowski, B., & Siegel, A.F. (2002). Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18, S233–S240.
  Rajagopalan, D., & Agarwal, P. (2005). Inferring pathways from gene lists using a literature-derived network of biological relationships. Bioinformatics 21, 788–793.
  Whitlock, M. (2005). Combining probability from independent tests: the weighted Z-method is superior to Fisher’s approach. J. Evol. Biol. 16,
  Kanehisa, M., Goto, S., Hattori, M., Aoki-Kinoshita, K.F., Itoh, M., Kawashima, S., Katayama, T., Araki, M., & Hirakawa, M. (2006). From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 34, D
  Dean, J., & Ghemawat, S. (2004). MapReduce: Simplified Data Processing on Large Clusters. OSDI 2004.

Questions?