Center-Piece Subgraphs: Problem definition and Fast Solutions Hanghang Tong Christos Faloutsos Carnegie Mellon University.

Presentation transcript:

Center-Piece Subgraphs: Problem definition and Fast Solutions Hanghang Tong Christos Faloutsos Carnegie Mellon University

2 Center-Piece Subgraph (Ceps) Given: Q query nodes. Find: the center-piece subgraph. Input of Ceps –Q query nodes –Budget b –k, the SoftAnd coefficient. Applications –Social networks –Law enforcement –Gene networks –…

3 Challenges in Ceps Q1: How to measure the importance? Q2: How to extract connection subgraph? Q3: How to do it efficiently?

4 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: Extract Alg. Q3: Efficiency Issue Experimental Results Conclusion

5 Ceps Overview Individual Score Calculation –Measure importance wrt individual query Combine Individual Scores –Measure importance wrt query set “Extract” Alg. – … the connection subgraphs

6 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Issue Experimental Results Conclusion

7 RWR: Individual Score Calculation Goal –Individual importance score r(i,j) = r_{i,j} –For each node j wrt each query i How to –Random walk with restart –Steady-state prob.

8 An Illustrating Example –Start from node 1 –Move randomly to a neighbor –With some probability p, return to node 1 –Score of node j = Prob(the random walk finally stays at j)
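The walk described on this slide can be sketched as a power iteration. This is a minimal illustration, not the authors' code: the function name `rwr_scores`, the column normalization of the adjacency matrix, and the restart probability of 0.15 are all assumptions made for the example.

```python
import numpy as np

def rwr_scores(adj, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Steady-state probabilities of a random walk with restart from `query`.

    adj     : square adjacency matrix (list of lists or array)
    query   : index of the query node the walker restarts at
    restart : probability of jumping back to the query node (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # avoid dividing by zero for isolated nodes
    W = adj / col_sums                 # column j holds the move probabilities out of j
    e = np.zeros(n)
    e[query] = 1.0                     # restart vector: all mass on the query node
    r = e.copy()
    for _ in range(max_iter):
        r_next = (1 - restart) * (W @ r) + restart * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r
```

For example, `rwr_scores([[0, 1, 0], [1, 0, 1], [0, 1, 0]], query=0)` returns one score per node of a three-node path; on a connected graph the scores sum to 1.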

9 Individual Score Calculation [Table: individual scores of Node 1 through Node 12 (rows) with respect to query nodes Q1, Q2, Q3 (columns)]

10 Individual Score Calculation [Same table as slide 9]

11 AND: Combine Scores Q: How to combine scores? A: Multiply …= prob. 3 random particles coincide on node j
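Written out, the multiplication rule reads as follows. The notation r(Q, j) for the combined score is an assumption for this note, and the product form presumes the particles walk independently, one per query node:

```latex
r(\mathcal{Q}, j) \;=\; \prod_{i=1}^{Q} r(i, j)
```

With three query nodes this is r(1,j) r(2,j) r(3,j), the probability that all three particles sit on node j at the same time.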

12 K_SoftAnd: Combine Scores Generalization – SoftAnd: We want nodes close to k of the Q query nodes (k < Q). Q: How to do that?

13 K_SoftAnd: Combine Scores Generalization – SoftAnd: We want nodes close to k of the Q query nodes (k < Q). Q: How to do that? A: Prob(at least k-out-of-Q will meet each other at j)

14 K_SoftAnd: Relaxation of AND Asking an AND query? No answer! –Disconnected communities –Noise

15 K_SoftAnd: Combine Score Goal –Importance wrt query set –Depend on query scenario! How to… –Meeting Probability –K_SoftAnd –Prob(at least k-out-of-Q will meet each other at j)
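One way to compute the meeting probability is a small dynamic program over the query nodes. The sketch below is an illustration under stated assumptions, not the authors' implementation: it treats the Q particles as independent, and the name `k_softand_scores` and the (Q, n) input layout are hypothetical.

```python
import numpy as np

def k_softand_scores(individual, k):
    """Prob(at least k of the Q particles sit on node j), for every node j.

    individual : array of shape (Q, n); individual[i][j] is r(i, j)
    k          : SoftAnd coefficient, 1 <= k <= Q
    """
    r = np.asarray(individual, dtype=float)
    num_queries, n = r.shape
    if not 1 <= k <= num_queries:
        raise ValueError("k must be between 1 and the number of query nodes")
    # dist[m, j] = probability that exactly m of the particles seen so far are on j
    dist = np.zeros((num_queries + 1, n))
    dist[0] = 1.0
    for i in range(num_queries):
        p = r[i]
        updated = dist * (1 - p)        # particle i is not on node j
        updated[1:] += dist[:-1] * p    # particle i is on node j
        dist = updated
    return dist[k:].sum(axis=0)
```

Setting `k` to the number of query nodes reproduces the AND product, and `k=1` gives the OR case.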

16 AND query vs. K_SoftAnd query [Figure: two panels, "AND query" and "2_SoftAnd query"; score axis scaled by 1e-4]

17 1_SoftAnd query = OR query
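The equivalence follows from taking the complement of "no particle is on j". Assuming independent particles and writing r(Q, j, k) for the K_SoftAnd score (notation assumed for this note):

```latex
r(\mathcal{Q}, j, 1)
  \;=\; \Pr(\text{at least one particle is on } j)
  \;=\; 1 - \prod_{i=1}^{Q} \bigl(1 - r(i, j)\bigr)
```

This is the usual probabilistic OR of the individual scores.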

18 Measuring Importance –Individual scores: random walk with restart (steady-state prob.) –Combining scores: meeting prob. (OR, K_SoftAnd, AND) [Table: Node 1 through Node 12 (rows) vs. Q1, Q2, Q3 and the combined 2_SoftAnd score (columns)]

19 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Experimental Results Conclusion

20 “Extract” Alg. Goal –Maximize total scores and –‘Appropriate’ connections How to… –Dynamic programming –Greedy alg.: pick up a promising node, find the ‘best’ path
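The greedy loop named on this slide can be sketched as below. This is a simplified stand-in, not the authors' "Extract" algorithm: the 'best' path is approximated by Dijkstra with a step cost of -log(score), scores are assumed to lie in (0, 1], and the budget is assumed to count every node in the subgraph, query nodes included. The names `best_path` and `extract` are hypothetical.

```python
import heapq
import math

def best_path(adj, scores, source, target):
    """Path from source to target that favors high-score nodes.

    Stepping onto node v costs -log(scores[v]), which is non-negative
    as long as scores are probabilities in (0, 1].
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue                      # stale heap entry
        for v in adj[u]:
            nd = d - math.log(max(scores[v], 1e-300))
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None                       # target unreachable from source
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def extract(adj, scores, queries, budget):
    """Greedily grow a subgraph around the query nodes, up to `budget` nodes."""
    chosen = set(queries)
    edges = set()
    candidates = sorted((v for v in adj if v not in chosen),
                        key=lambda v: scores[v], reverse=True)
    for dest in candidates:               # most promising node first
        if len(chosen) >= budget:
            break
        if dest in chosen:
            continue
        new_nodes, new_edges = set(), set()
        for q in queries:                 # connect dest to every query node
            path = best_path(adj, scores, q, dest)
            if path is None:
                continue
            new_nodes.update(path)
            new_edges.update(frozenset(e) for e in zip(path, path[1:]))
        if not new_nodes or len(chosen | new_nodes) > budget:
            continue                      # unreachable, or would exceed the budget
        chosen |= new_nodes
        edges |= new_edges
    return chosen, edges
```

Here `adj` maps each node to a list of neighbors and `scores` maps each node to its combined score; the function returns the chosen node set and the undirected edges along the selected paths.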

21 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Experimental Results Conclusion

22 Graph Partition: Efficiency Issue Straightforward way –Solve Q linear systems –Cost linear in the number of edges Observation –Skewed score distribution How to… –Graph partition
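The idea can be illustrated by solving the RWR linear systems only on the partitions that contain query nodes and treating every other node as having score zero. This is a sketch under assumptions, not the Fast Ceps implementation: the partition labels are taken as given (any graph partitioner could produce them), a dense solver is used for brevity, and the name `rwr_on_partition` and the restart value 0.15 are hypothetical.

```python
import numpy as np

def rwr_on_partition(adj, labels, queries, restart=0.15):
    """Approximate RWR scores using only the partitions that hold query nodes.

    adj     : square adjacency matrix of the full graph
    labels  : labels[v] is the partition id of node v (precomputed, assumed given)
    queries : list of query node indices
    Returns an array of shape (len(queries), n); nodes outside the kept
    partitions get score 0.
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    keep = np.flatnonzero(np.isin(labels, labels[list(queries)]))
    sub = adj[np.ix_(keep, keep)]          # subgraph induced by the kept partitions
    col_sums = sub.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # guard against isolated nodes
    W = sub / col_sums
    m = len(keep)
    system = np.eye(m) - (1 - restart) * W
    position = {int(node): idx for idx, node in enumerate(keep)}
    scores = np.zeros((len(queries), adj.shape[0]))
    for row, q in enumerate(queries):      # one linear system per query node
        e = np.zeros(m)
        e[position[int(q)]] = 1.0
        scores[row, keep] = restart * np.linalg.solve(system, e)
    return scores
```

Because most of the score mass is concentrated near the query nodes (the skewed distribution noted above), dropping distant partitions trades a small loss in quality for a smaller system to solve.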

23 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Issue Experimental Results Conclusion

24 Experimental Setup Dataset –DBLP/authorship –Author-Paper –315k nodes –1,800k edges Evaluation Criteria –I Node Ratio –I Edge Ratio

25 Experimental Setup We want to check –Do the goodness criteria make sense? –Does the “extract” alg. capture most of the important nodes/edges? –Efficiency

26 Case Study: AND query

27 2_SoftAnd query Statistic database

28 Evaluation of “Extract” Alg. [Plot: Node Ratio vs. budget (b), for 2 query nodes and 3 query nodes] With 20 nodes, 90%+ is preserved.

29 Running Time vs. Quality for Fast Ceps [Plot: quality vs. running time] ~90% quality at a 6:1 speedup

30 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Issue Experimental Results Conclusion

31 Conclusion Q1: How to measure the importance? A1: RWR + K_SoftAnd Q2: How to find the connection subgraph? A2: “Extract” Alg. Q3: How to do it efficiently? A3: Graph partition (Fast Ceps) –~90% quality –6:1 speedup

32 Q&A Thank you!