C-DEM: A Multi-Modal Query System for Drosophila Embryo Databases Fan Guo, Lei Li, Eric Xing, Christos Faloutsos Carnegie Mellon University {fanguo, leili,


C-DEM: A Multi-Modal Query System for Drosophila Embryo Databases
Fan Guo, Lei Li, Eric Xing, Christos Faloutsos
Carnegie Mellon University
{fanguo, leili, epxing,

Background
Fruit-fly development in genetic studies:
– Genes controlling the body plan and organ patterning are similar to those in higher animals, including humans.
Objective: a framework for applying data mining techniques to assist biological research.

The Graph Representation
Node types: Images, Genes, Keywords
Image-layer edges: nearest neighbors in feature space
[Figure label: embryonic hindgut]
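The image-layer edges described above can be sketched as a k-nearest-neighbor graph over feature vectors. This is a minimal illustration, not the C-DEM implementation: the function name `knn_edges`, the Euclidean distance, the choice of `k`, and the symmetrization step are all assumptions made for the example.

```python
import numpy as np


def knn_edges(features, k=2):
    """Build a symmetric 0/1 adjacency matrix linking each row of
    `features` to its k nearest rows (hypothetical helper; Euclidean
    distance is an assumption, the slides do not name a metric)."""
    X = np.asarray(features, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all feature vectors.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    # A node is never its own neighbor.
    np.fill_diagonal(dist, np.inf)
    adj = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(dist[i])[:k]
        adj[i, nearest] = 1.0
    # Keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)


# Toy feature vectors standing in for image descriptors.
toy_features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
toy_adj = knn_edges(toy_features, k=1)
```

In a full multi-modal graph, a matrix like this would cover only the image-to-image block; edges to gene and keyword nodes would come from the annotations rather than from feature distances.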

Proximity Measure
Random Walk with Restart:
– Start from a node s.
– Walk to a randomly chosen neighbor with probability 1-c.
– Restart at s with probability c.
– Compute the steady-state probability vector.
– Complexity: O(E), but faster methods exist (Tong et al., ICDM’06).
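The steps above can be sketched as a simple power iteration. This is an illustrative sketch rather than the system's actual code: the function name, the default restart probability `c=0.15`, the tolerance, and the column normalization of the adjacency matrix are assumptions, and it is the plain iterative method, not the faster approach of Tong et al.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, c=0.15, tol=1e-10, max_iter=1000):
    """Approximate the steady-state RWR vector by power iteration.

    adj   : square adjacency matrix (adj[i, j] > 0 means an edge j -> i)
    seeds : indices of the restart node(s)
    c     : restart probability (the walk continues with probability 1 - c)
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Column-normalize so each column holds transition probabilities.
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # isolated nodes: avoid division by zero
    W = adj / col_sums
    # Restart vector: uniform mass on the seed nodes.
    e = np.zeros(n)
    e[list(seeds)] = 1.0 / len(seeds)
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1 - c) * (W @ p) + c * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy path graph 0 - 1 - 2, restarting at node 0.
path_adj = np.array([[0, 1, 0],
                     [1, 0, 1],
                     [0, 1, 0]])
path_scores = random_walk_with_restart(path_adj, seeds=[0])
```

Each iteration touches every edge once when the matrix is stored sparsely, which is where the O(E) cost per pass comes from; the dense matrix here is only for readability.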


Proximity Measure: Computing the Steady-State Probability
Terms of the equation: the desired probability vector, the adjacency matrix, and a vector with non-zero entries for the restart nodes.
Complexity: O(E), but faster methods exist (Tong et al., ICDM’06)
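The three terms named above match the usual fixed-point form of Random Walk with Restart. The following is a sketch under stated assumptions: the symbols are chosen here for illustration, with p the desired probability vector, W the column-normalized adjacency matrix, e the restart vector, and c the restart probability as defined on the earlier slide.

```latex
\mathbf{p} = (1-c)\,\mathbf{W}\,\mathbf{p} + c\,\mathbf{e}
\quad\Longrightarrow\quad
\mathbf{p} = c\,\bigl(\mathbf{I} - (1-c)\,\mathbf{W}\bigr)^{-1}\mathbf{e}
```

The left-hand form is what an iterative solver repeats until convergence; the right-hand closed form follows by moving the (1-c)Wp term to the left side and inverting.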

Multi-Modal Query Results
Result types: 2D Expression Images, Genes, Annotation Terms
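One way to turn a single proximity vector into per-modality result lists is to rank nodes within each node type. The sketch below is hypothetical: the function `top_k_by_type`, the type labels, and the toy scores are invented for illustration and are not taken from the C-DEM system.

```python
def top_k_by_type(scores, node_types, node_names, k=3, exclude=()):
    """Group nodes by type and keep the k highest-scoring nodes of each.

    scores     : proximity score per node (e.g. from a random walk)
    node_types : type label per node, such as "image", "gene", "keyword"
    node_names : display name per node
    exclude    : node indices to skip, typically the query nodes themselves
    """
    results = {}
    # Visit nodes from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    for idx in ranked:
        if idx in exclude:
            continue
        bucket = results.setdefault(node_types[idx], [])
        if len(bucket) < k:
            bucket.append((node_names[idx], scores[idx]))
    return results


# Toy example: node 0 is the query image and is excluded from the results.
toy_scores = [0.4, 0.1, 0.3, 0.2]
toy_types = ["image", "image", "gene", "keyword"]
toy_names = ["img_a", "img_b", "gene_x", "term_y"]
toy_results = top_k_by_type(toy_scores, toy_types, toy_names, k=1, exclude={0})
```

Because every node type shares the same score vector, a query posed as an image, a gene, or an annotation term can return ranked results in all three modalities at once.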

More Mining Tasks
– Image auto-captioning
– Gene function identification

Related Work
– Berkeley Drosophila Genome Project
– FlyExpress
– Berkeley Drosophila Transcription Network Project (bdtnp.lbl.gov)

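System Architecture
Components: Browser-based UI, Tomcat Web Server (JSP Application), Computing Engine
– Browser-based UI and Tomcat Web Server: queries and result pages exchanged over HTTP
– Tomcat Web Server and Computing Engine: remote function calls and results exchanged over RMI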