Fast Random Walk with Restart and Its Applications. Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan. ICDM 2006, Dec. 18-22, Hong Kong.

Presentation transcript:


2 Motivating Questions
Q: How to measure relevance? A: Random walk with restart
Q: How to do it efficiently? A: This talk tries to answer!

3 Random walk with restart


6 [Figure: ranking vector for query Node 4, with one relevance score per node (Node 1, Node 2, ..., Node 11, ...)]

7 Automatic Image Caption [Pan KDD04] [Figure: graph of Text, Image, and Region nodes plus a Test Image; text labels: Jet, Plane, Runway, Candy, Texture, Background]

8 Neighborhood Formation [Sun ICDM05]

9 Center-Piece Subgraph [Tong KDD06]

10 Other Applications
–Content-based Image Retrieval
–Personalized PageRank
–Anomaly Detection (for node; link)
–Link Prediction [Getoor], [Jensen], …
–Semi-supervised Learning
–…. [Put Authors]

11 Roadmap
Background
–RWR: Definitions
–RWR: Algorithms
Basic Idea
FastRWR
–Pre-Compute Stage
–On-Line Stage
Experimental Results
Conclusion

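12 Computing RWR: $\vec{r}_i = c\,\tilde{W}\,\vec{r}_i + (1-c)\,\vec{e}_i$, where $\vec{r}_i$ is the n x 1 ranking vector, $\tilde{W}$ is the n x n (normalized) adjacency matrix, and $\vec{e}_i$ is the n x 1 starting vector. Q: Given $\vec{e}_i$, how to solve?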
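
The question on this slide has a closed-form answer, which the later slides build on. The short derivation below assumes the usual RWR notation: $\tilde{W}$ is the normalized adjacency matrix, $c$ is the probability of continuing the walk (so $1-c$ is the restart probability), and $Q = I - c\,\tilde{W}$ is assumed to be invertible, which holds for $0 < c < 1$ when $\tilde{W}$ is column-normalized.

```latex
\begin{align}
\vec{r}_i &= c\,\tilde{W}\,\vec{r}_i + (1-c)\,\vec{e}_i \\
(I - c\,\tilde{W})\,\vec{r}_i &= (1-c)\,\vec{e}_i \\
\vec{r}_i &= (1-c)\,(I - c\,\tilde{W})^{-1}\,\vec{e}_i \;=\; (1-c)\,Q^{-1}\,\vec{e}_i
\end{align}
```

The two baseline methods differ only in how they evaluate the last line: by iterating the first line until convergence, or by inverting $Q$ ahead of time.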

13 OnTheFly: no pre-computation and light storage, but slow on-line response: O(mE)
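
A minimal sketch of the on-the-fly approach, assuming it is the standard power iteration on r = c W r + (1 - c) e_i (the O(mE) cost is then presumably m iterations, each touching E edges). The function names, the toy graph, and the dense NumPy matrix are illustrative assumptions; a real implementation would use a sparse matrix.

```python
import numpy as np

def column_normalize(A):
    """Scale each column of a nonnegative adjacency matrix to sum to 1."""
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # isolated nodes keep an all-zero column
    return A / col_sums

def rwr_on_the_fly(W, i, c=0.9, tol=1e-10, max_iter=1000):
    """Iterate r <- c*W*r + (1-c)*e_i until the L1 change drops below tol."""
    n = W.shape[0]
    e = np.zeros(n)
    e[i] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (W @ r) + (1.0 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy 4-node path graph 0-1-2-3 (hypothetical data).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = column_normalize(A)
r = rwr_on_the_fly(W, i=0)  # relevance of every node with respect to node 0
```

Nothing is stored between queries, which is why the storage cost is light and the whole iteration cost is paid at query time.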

14 PreCompute: fast on-line response, but heavy pre-computation cost O(n^3) and storage cost O(n^2)
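
A minimal sketch of the pre-compute approach, assuming it means inverting Q = I - cW once off-line and answering each query by reading one column of the inverse. The function names and the toy matrix are hypothetical.

```python
import numpy as np

def precompute_q_inverse(W, c=0.9):
    """Off-line: invert Q = I - c*W once (O(n^3) time, O(n^2) storage)."""
    n = W.shape[0]
    return np.linalg.inv(np.eye(n) - c * W)

def rwr_query(Q_inv, i, c=0.9):
    """On-line: r_i = (1-c) * Q^{-1} e_i, i.e. a scaled column of Q^{-1}."""
    return (1.0 - c) * Q_inv[:, i]

# Toy column-normalized matrix for the path graph 0-1-2 (hypothetical data).
W = np.array([[0.0, 0.5, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 0.5, 0.0]])
Q_inv = precompute_q_inverse(W)
r = rwr_query(Q_inv, i=0)
```

The on-line step is a lookup, but the dense n x n inverse is what makes this impractical for large graphs.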

15 Q: How to balance on-line cost against off-line cost?


17 Basic Idea: Find Community; Fix the remaining; Combine

18 Basic Idea: Pre-computational stage. A few small matrix inversions, instead of ONE BIG one. [Figure: Q-matrices + link matrices U, V]

19 Basic Idea: On-Line Stage. A few matrix-vector multiplications, instead of MANY. [Figure: Query, multiplied by the Q-matrices and the link matrices U, V, then summed to give the Result]


21 Pre-compute Stage
P1: B_Lin Decomposition
–P1.1 partition
–P1.2 low-rank approximation
P2: Q matrices
–P2.1 computing (for each partition)
–P2.2 computing (for concept space)

22 P1.1: partition. Within-partition links; cross-partition links

23 P1.1: block-diagonal

24 P1.2: LRA (low-rank approximation) for the cross-partition links: $\tilde{W}_2 \approx U S V$

25 [Figure: graph regions labeled c1, c3, c4, c…, and the resulting decomposition: block-diagonal part + U S V]
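
The two decomposition steps (P1.1 partition, P1.2 low-rank approximation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the partition labels are taken as given (in practice they would come from a graph partitioner), a truncated SVD stands in for whatever low-rank method the paper uses, and `blin_decompose` is a hypothetical name.

```python
import numpy as np

def blin_decompose(W, labels, rank):
    """Split W into within-partition links W1 plus a rank-`rank`
    approximation U @ S @ V of the cross-partition links W2."""
    labels = np.asarray(labels)
    same_part = labels[:, None] == labels[None, :]
    W1 = np.where(same_part, W, 0.0)  # block-diagonal when nodes are ordered by partition
    W2 = W - W1                       # cross-partition links
    U_full, s, Vt = np.linalg.svd(W2, full_matrices=False)
    U = U_full[:, :rank]
    S = np.diag(s[:rank])
    V = Vt[:rank, :]
    return W1, U, S, V

# Toy graph: two triangles joined by the single edge 2-3 (hypothetical data).
A = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[a, b] = A[b, a] = 1.0
W = A / A.sum(axis=0)  # column-normalized adjacency matrix

W1, U, S, V = blin_decompose(W, labels=[0, 0, 0, 1, 1, 1], rank=2)
```

In this toy case the cross-partition part has rank 2, so the decomposition is exact; on a real graph the rank is a tuning knob that trades accuracy against cost.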

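26 P2.1: Computing $Q_1^{-1} = (I - c\,\tilde{W}_1)^{-1}$, where $\tilde{W}_1$ holds the within-partition links (one small inversion per partition)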

27 Comparing $Q^{-1}$ (one full inverse) and $Q_1^{-1}$ (per-partition inverses)
Computing time
–100,000 nodes; 100 partitions
–Computing $Q_1^{-1}$ is 10,000x faster!
Storage cost (100x saving!)
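
The savings on this slide follow from simple arithmetic, assuming equal-sized partitions, cubic-cost dense inversion, and dense storage of each inverse (all simplifying assumptions; `inversion_savings` is a hypothetical helper). With k partitions the time ratio is k^2 and the storage ratio is k.

```python
def inversion_savings(n, k):
    """Compare one n x n inversion with k inversions of (n/k) x (n/k) blocks."""
    block = n // k
    time_ratio = n**3 / (k * block**3)     # cubic-cost inversion
    storage_ratio = n**2 / (k * block**2)  # dense storage of the inverse(s)
    return time_ratio, storage_ratio

# The slide's setting: 100,000 nodes split into 100 partitions.
time_ratio, storage_ratio = inversion_savings(n=100_000, k=100)
```

For these numbers the storage ratio is 100, matching the "100x saving" on the slide, and the time ratio is 100^2 = 10,000.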

28 P2.2: Computing $\tilde{\Lambda} = (S^{-1} - c\,V Q_1^{-1} U)^{-1}$

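29 SM (Sherman-Morrison) Lemma says: $(I - c\,\tilde{W}_1 - c\,U S V)^{-1} = Q_1^{-1} + c\,Q_1^{-1} U \tilde{\Lambda} V Q_1^{-1}$, with $Q_1^{-1} = (I - c\,\tilde{W}_1)^{-1}$ and $\tilde{\Lambda} = (S^{-1} - c\,V Q_1^{-1} U)^{-1}$. We have: the Q-matrices ($Q_1^{-1}$, $\tilde{\Lambda}$) and the link matrices (U, V)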
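
The identity behind this slide is the Sherman-Morrison-Woodbury formula, and it can be checked numerically. The sketch below uses small random matrices (hypothetical data, scaled so that every inverse exists) and assumes the notation Q1_inv = (I - c W1)^{-1} and Lam = (S^{-1} - c V Q1_inv U)^{-1}.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, c = 6, 2, 0.9

# Block-diagonal within-partition part with two 3 x 3 blocks.
W1 = np.zeros((n, n))
W1[:3, :3] = rng.random((3, 3)) / 6
W1[3:, 3:] = rng.random((3, 3)) / 6

# Rank-t cross-partition part U @ S @ V.
U = rng.random((n, t)) / 6
S = np.diag([0.5, 0.25])
V = rng.random((t, n)) / 6

Q1_inv = np.linalg.inv(np.eye(n) - c * W1)                   # the "few small" inversions
Lam = np.linalg.inv(np.linalg.inv(S) - c * V @ Q1_inv @ U)   # one t x t inversion

lhs = np.linalg.inv(np.eye(n) - c * W1 - c * U @ S @ V)      # the "one big" inversion
rhs = Q1_inv + c * Q1_inv @ U @ Lam @ V @ Q1_inv             # what the lemma gives instead
```

Here `lhs` and `rhs` agree up to floating-point error, so the big inverse never has to be formed: only the block inverses and the small t x t matrix are needed.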


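31 On-Line Stage. Q: Query to Result? A: (SM lemma) [Figure: the query is multiplied by the Q-matrices and the link matrices U, V, and the pieces are summed to give the result]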

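32 On-Line Query Stage
q1: $\vec{r}_0 = Q_1^{-1}\,\vec{e}_i$
q2: $\vec{r}_i = V\,\vec{r}_0$
q3: $\vec{r}_i = \tilde{\Lambda}\,\vec{r}_i$
q4: $\vec{r}_i = U\,\vec{r}_i$
q5: $\vec{r}_i = Q_1^{-1}\,\vec{r}_i$
q6: $\vec{r}_i = (1-c)\,(\vec{r}_0 + c\,\vec{r}_i)$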

33 $\vec{r}_i = (1-c)\,(Q_1^{-1}\vec{e}_i + c\,Q_1^{-1} U \tilde{\Lambda} V Q_1^{-1}\vec{e}_i)$
q1: Find the community
q2-q5: Compensate for out-community links
q6: Combine

34 Example. We have: [Figure: Q-matrices + link matrices U, V]. We want to: [Figure: query on the example graph]

35 q1: Find Community

36 q2-q5: Out-community links

37 q6: Combination
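
The three phases above (find the community, compensate for out-community links, combine) can be sketched end to end. The mapping of q1 to q6 onto individual matrix-vector products is an assumption based on the SM lemma, the toy graph is hypothetical, and the pre-compute stage is written inline with a dense inverse and an SVD purely for illustration.

```python
import numpy as np

def fast_rwr_query(Q1_inv, U, Lam, V, i, c):
    """On-line stage: a handful of matrix-vector products per query."""
    n = Q1_inv.shape[0]
    e = np.zeros(n)
    e[i] = 1.0
    r0 = Q1_inv @ e                  # q1: find the community
    r = V @ r0                       # q2: project onto the low-rank space
    r = Lam @ r                      # q3
    r = U @ r                        # q4: map back to the nodes
    r = Q1_inv @ r                   # q5
    return (1.0 - c) * (r0 + c * r)  # q6: combine

# Toy graph: two triangles joined by the single edge 2-3 (hypothetical data).
A = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[a, b] = A[b, a] = 1.0
W = A / A.sum(axis=0)  # column-normalized adjacency matrix
c = 0.9

# Pre-compute stage for this toy case.
W1 = W.copy()
W1[2, 3] = W1[3, 2] = 0.0                   # keep within-partition links only
U_full, s, Vt = np.linalg.svd(W - W1)       # cross-partition links have rank 2 here
U, S, V = U_full[:, :2], np.diag(s[:2]), Vt[:2, :]
Q1_inv = np.linalg.inv(np.eye(6) - c * W1)  # in practice: one inversion per block
Lam = np.linalg.inv(np.linalg.inv(S) - c * V @ Q1_inv @ U)

r_fast = fast_rwr_query(Q1_inv, U, Lam, V, i=0, c=c)
r_exact = (1 - c) * np.linalg.solve(np.eye(6) - c * W, np.eye(6)[:, 0])
```

Because the low-rank part is exact in this toy graph, `r_fast` matches `r_exact` up to floating-point error; with a truncated rank on a real graph the result is an approximation.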


39 Experimental Setup
Dataset
–DBLP/authorship
–Author-Paper
–315k nodes
–1,800k edges
Quality: Relative Accuracy
Application: Center-Piece Subgraph

40 Query Time vs. Pre-Compute Time [Plot: log query time vs. log pre-compute time]

41 Query Time vs. Pre-Storage [Plot: log query time vs. log storage]

42 Several orders of magnitude saved in pre-computation and storage; up to 150x faster response; 90%+ quality preserved. [Plots: quality vs. log storage; quality vs. log pre-compute time; quality vs. log query time]


44 Conclusion
FastRWR
–Reasonable quality preservation (90%+)
–150x speed-up: query time
–Orders of magnitude saving: pre-compute & storage
More in the paper
–The variant of FastRWR and theoretical justification
–Implementation details: normalization, low-rank approximation, sparse
–More experiments: other datasets, other applications

45 Q&A Thank you!

46 Future work
Incremental FastRWR
Parallel FastRWR
–Partition
–Q-matrices for each partition
Hierarchical FastRWR
–How to compute one Q-matrix for

47 Possible Q? Why RWR?