Fast Random Walk with Restart and Its Applications

1 Fast Random Walk with Restart and Its Applications
Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan
ICDM 2006, Dec. 18-22, Hong Kong

2 Motivating Questions
Q: How to measure the relevance?
A: Random walk with restart (RWR)
Q: How to do it efficiently?
A: This talk tries to answer!

3 Random walk with restart
[Figure: example graph with 12 nodes]

4 Random walk with restart
[Figure: the 12-node example graph coloured by score, next to a table of scores for Node 1 through Node 12 (the ranking vector); values surviving extraction: 0.13, 0.10, 0.22, 0.05, 0.08, 0.04, 0.03, 0.02]
Nearby nodes, higher scores
More red, more relevant

5 Automatic Image Caption
[Figure: two captioned images, {Sea, Sun, Sky, Wave} and {Cat, Forest, Grass, Tiger}, and an uncaptioned test image]
Q: Which keywords caption the test image? {?, ?, ?}
A: RWR! [Pan KDD2004]

6 [Figure: graph with Region, Image and Keyword nodes, plus the Test Image; keywords: Sea, Sun, Sky, Wave, Cat, Forest, Tiger, Grass]

7 [Figure: the same Region, Image and Keyword graph; the Test Image is captioned {Grass, Forest, Cat, Tiger}]

8 Neighborhood Formulation
Q: Which conference is most related to ICDM?
A: RWR! [Sun ICDM2005]
[Figure: graph of Conference and Author nodes]

9 NF: example

10 Center-Piece Subgraph (CePS)
[Figure: Q: given the Original Graph (black: query nodes), find the CePS]
A: RWR! [Tong KDD 2006]

11 CePS: Example

12 Other Applications
Content-based Image Retrieval [He]
Personalized PageRank [Jeh], [Widom], [Haveliwala]
Anomaly Detection (for node; link) [Sun]
Link Prediction [Getoor], [Jensen]
Semi-supervised Learning [Zhu], [Zhou]

13 Roadmap
Background
  RWR: Definitions
  RWR: Algorithms
Basic Idea
FastRWR
  Pre-Compute Stage
  On-Line Stage
Experimental Results
Conclusion

14 Computing RWR
$\vec{r}_i = c\,\tilde{W}\,\vec{r}_i + (1 - c)\,\vec{e}_i$
$\vec{r}_i$: ranking vector (n x 1)
$\tilde{W}$: adjacency matrix (n x n)
$\vec{e}_i$: starting vector (n x 1)
$1 - c$: restart probability
[Figure: the 12-node example graph]
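
A minimal sketch of the iterative way to compute one ranking vector: repeat "follow an edge with probability c, jump back to the query node with probability 1 - c" until the scores stop changing. The 5-node graph, the value c = 0.9, the column normalization of the adjacency matrix, and the function name are all assumptions made for illustration; none of them come from the slides.

```python
def rwr_power_iteration(adj, query, c=0.9, tol=1e-10, max_iter=1000):
    """Iterate r <- c * W~ r + (1 - c) * e_query until r stops changing.

    adj   : dict mapping node -> list of neighbours (undirected, every node has >= 1 edge)
    query : the node the walk restarts from
    c     : probability of following an edge; 1 - c is the restart probability
    """
    nodes = sorted(adj)
    degree = {u: len(adj[u]) for u in nodes}
    r = {u: 1.0 if u == query else 0.0 for u in nodes}
    for _ in range(max_iter):
        # Restart mass goes to the query node only.
        new_r = {u: (1.0 - c) if u == query else 0.0 for u in nodes}
        for u in nodes:
            # Column-normalised W~: node u spreads its score evenly over its edges.
            share = c * r[u] / degree[u]
            for v in adj[u]:
                new_r[v] += share
        delta = sum(abs(new_r[u] - r[u]) for u in nodes)
        r = new_r
        if delta < tol:
            break
    return r


# Hypothetical toy graph (not the 12-node graph from the slides).
GRAPH = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
scores = rwr_power_iteration(GRAPH, query=1)
```

Each pass touches every edge once, so the total work grows with the number of passes times the number of edges.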

15 Fast RWR Finds the Root Solution!
Beyond RWR:
Maxwell Equation for Web! [Chakrabarti]
P-PageRank [Haveliwala]
SM Learning [Zhou, Zhu]
RL in CBIR [He]
RWR [Pan, Sun]
PageRank [Haveliwala]

16 Q: Given query i, how to solve it?

17 OnTheFly
[Figure: the 12-node example graph with scores computed at query time]
No pre-computation / light storage
Slow on-line response: O(mE)

18 PreCompute [Haveliwala]
[Figure: the 12-node example graph and the pre-computed matrix R, with nodes listed in the order 10, 9, 12, 2, 1, 8, 3, 11, 4, 6, 5, 7]

19 PreCompute
[Figure: the 12-node example graph with scores looked up from the pre-computed matrix]
Fast on-line response
Heavy pre-computation cost, $O(n^3)$, and storage cost, $O(n^2)$
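
The PreCompute strategy can be sketched as follows, assuming the ranking vector for query i is column i of (1 - c)(I - c W~)^-1, where W~ is a column-normalized adjacency matrix and 1 - c is the restart probability. The 3-node path graph, c = 0.9 and the helper names are hypothetical. Inverting one dense n x n matrix is what makes this approach cubic in time and quadratic in storage.

```python
def invert(matrix):
    """Invert a square matrix (list of rows) by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def precompute_all_rankings(w_norm, c=0.9):
    """Off-line: return R = (1 - c) * (I - c * W~)^-1 as a dense matrix."""
    n = len(w_norm)
    q = [[(1.0 if i == j else 0.0) - c * w_norm[i][j] for j in range(n)]
         for i in range(n)]
    return [[(1.0 - c) * x for x in row] for row in invert(q)]


def ranking_for(r_matrix, query):
    """On-line: answering a query is just reading one column of R."""
    return [row[query] for row in r_matrix]


# Hypothetical path graph 0 - 1 - 2; column j of W_PATH sums to 1.
W_PATH = [
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.5, 0.0],
]
R = precompute_all_rankings(W_PATH)
r0 = ranking_for(R, 0)
```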

20 Q: How to Balance?
[Figure: On-line cost vs. Off-line cost]

22 Basic Idea
Find Community
Combine
Fix the remaining
[Figure: the 12-node example graph, shown whole and split into groups of nodes, with example scores]

23 Pre-computational stage
Q: Efficiently compute and store $Q^{-1}$
A: A few small, instead of ONE BIG, matrix inversions

24 On-Line Query Stage
Q: Efficiently recover one column of $Q^{-1}$
A: A few, instead of MANY, matrix-vector multiplications

26 Pre-compute Stage
P1: B_Lin Decomposition
  P1.1 partition
  P1.2 low-rank approximation
P2: Q matrices
  P2.1 computing (for each partition)
  P2.2 computing (for concept space)

27 P1.1: partition
[Figure: the 12-node example graph and its matrix with nodes listed in the order 10, 9, 12, 2, 8, 1, 3, 11, 4, 6, 5, 7; within-partition links vs. cross-partition links]

28 P1.1: block-diagonal
[Figure: the 12-node example graph and its reordered matrix, in which the within-partition links form a block-diagonal matrix]
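
A small sketch of the split that the partition step implies, under the assumption that partition labels are already available (the transcript does not say how they are obtained). Links whose endpoints share a label go into W1, all others into W2; if nodes of the same partition are numbered contiguously, W1 is block-diagonal. The 5-node graph, the labels and the function name are hypothetical.

```python
def split_by_partition(w, labels):
    """Split W into W1 (within-partition links) and W2 (cross-partition links)."""
    n = len(w)
    w1 = [[w[i][j] if labels[i] == labels[j] else 0.0 for j in range(n)]
          for i in range(n)]
    w2 = [[w[i][j] if labels[i] != labels[j] else 0.0 for j in range(n)]
          for i in range(n)]
    return w1, w2


# Hypothetical graph: triangle {0, 1, 2}, pair {3, 4}, and one bridge edge 2-3.
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
LABELS = [0, 0, 0, 1, 1]  # assumed output of some partitioning step

W1, W2 = split_by_partition(W, LABELS)
cross_links = [(i, j) for i in range(5) for j in range(5) if W2[i][j] != 0.0]
```

In this toy case only the bridge edge ends up in W2, which is why W2 is a good candidate for a low-rank approximation.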

29 P1.2: LRA for $W_2$
$W_2 \approx U S V$, with $|S| \ll |W_2|$
[Figure: the 12-node example graph and the reordered matrix]

30 [Figure: a matrix written as the sum of two matrices]

31 P2.1 Computing

32 Comparing ONE BIG inversion with a few small ones
Example: 100,000 nodes; 100 partitions
Computing Time: 10,000x faster!
Storage Cost: 100x saving!
[Figure: block-diagonal matrix with blocks $Q_{1,1}, Q_{1,2}, \ldots, Q_{1,k}$]
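
The arithmetic behind this comparison can be checked in a few lines. This is a back-of-the-envelope sketch under stated assumptions: the partitions are equal-sized, inverting a dense m x m matrix costs about m^3 operations, and the inverse is stored densely (m^2 numbers). Only the node and partition counts come from the slide.

```python
n = 100_000  # nodes (from the slide)
k = 100      # partitions (from the slide)
block = n // k  # nodes per partition, assuming equal-sized partitions

# One big inversion vs. k small ones (assumed cubic cost per dense inversion).
one_big_time = n ** 3
k_small_time = k * block ** 3
time_ratio = one_big_time // k_small_time  # equals k ** 2

# One dense n x n inverse vs. k dense block inverses.
one_big_storage = n ** 2
k_small_storage = k * block ** 2
storage_ratio = one_big_storage // k_small_storage  # equals k
```

Under these assumptions the storage ratio comes out at 100, matching the "100x saving" quoted on the slide, and the time ratio comes out at 10,000.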

33 Q: How to fix the green portions?
[Figure: matrix equation with approximated terms; the green portions are the part still to be fixed]

34 P2.2 Computing
[Equation: a matrix computed from the inverted blocks $Q_{1,1}, Q_{1,2}, \ldots, Q_{1,k}$ together with $U$ and $V$]

35 We have:
SM Lemma says:
[Figure: equations with terms labelled Communities and Bridges]
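
For reference, one standard way to write the Sherman-Morrison lemma for this setting is given below. The notation is an assumption, since the slide's own equations did not survive extraction: $W_1$ is the block-diagonal within-partition part, $U S V$ is the low-rank approximation of the cross-partition part, $c$ is the probability of following an edge (so $1 - c$ is the restart probability), and $\tilde{\Lambda}$ is a name chosen here for the small inner inverse.

```latex
\tilde{W} \approx W_1 + U S V, \qquad Q_1 = I - c\,W_1

(I - c\,W_1 - c\,U S V)^{-1}
  = Q_1^{-1} + c\,Q_1^{-1} U \tilde{\Lambda} V Q_1^{-1},
\qquad
\tilde{\Lambda} = \left(S^{-1} - c\,V Q_1^{-1} U\right)^{-1}
```

Read this way, $Q_1^{-1}$ only needs the small per-community inversions, while $\tilde{\Lambda}$ is a small matrix that accounts for the bridges between communities.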

37 On-Line Stage
Q: Query + Pre-Computation = Result ?
A: (SM lemma)

38 On-Line Query Stage
[Equations for steps q1 through q6]
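
The sketch below shows one way an on-line query could be answered with a few matrix-vector products once the pre-computed pieces are available. It is an illustration under assumptions, not a transcription of the slide: the six steps are one plausible reading of the q1 to q6 labels, whose formulas were lost; the 5-node graph, c = 0.9 and all names are hypothetical; and the cross-partition part is factorized exactly (it has only two non-zero entries), so the result matches the direct solution instead of approximating it.

```python
def invert(m):
    """Gauss-Jordan inverse of a small dense matrix given as a list of rows."""
    n = len(m)
    a = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


C = 0.9  # assumed probability of following an edge; 1 - C is the restart probability

# Hypothetical graph: community {0, 1, 2}, community {3, 4}, bridge edge 2-3.
# W is the column-normalised adjacency matrix (each column sums to 1).
W = [
    [0.0, 0.5, 1 / 3, 0.0, 0.0],
    [0.5, 0.0, 1 / 3, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 1 / 3, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.5, 0.0],
]
BLOCKS = [[0, 1, 2], [3, 4]]
n = len(W)

# Pre-compute: invert each small diagonal block of I - C * W1 separately.
Q1_inv = [[0.0] * n for _ in range(n)]
for idx in BLOCKS:
    sub = [[float(i == j) - C * W[i][j] for j in idx] for i in idx]
    sub_inv = invert(sub)
    for a_pos, i in enumerate(idx):
        for b_pos, j in enumerate(idx):
            Q1_inv[i][j] = sub_inv[a_pos][b_pos]

# Exact rank-2 factorisation W2 = U S V of the two cross-partition entries.
U = [[0.0, 0.0] for _ in range(n)]
U[3][0] = W[3][2]  # link from node 2 into node 3
U[2][1] = W[2][3]  # link from node 3 into node 2
S = [[1.0, 0.0], [0.0, 1.0]]
V = [[0.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0, 0.0]]

# Pre-compute the small inner inverse (S^-1 - C * V Q1^-1 U)^-1.
VQU = matmul(matmul(V, Q1_inv), U)
S_inv = invert(S)
LAM = invert([[S_inv[i][j] - C * VQU[i][j] for j in range(2)] for i in range(2)])


def query(i):
    """On-line stage: six cheap steps, no large inversion."""
    e = [float(j == i) for j in range(n)]
    r0 = matvec(Q1_inv, e)   # step 1: within-community part
    r = matvec(V, r0)        # step 2: project onto the bridge space
    r = matvec(LAM, r)       # step 3: apply the small pre-computed inverse
    r = matvec(U, r)         # step 4: map back to node space
    r = matvec(Q1_inv, r)    # step 5: propagate inside communities again
    return [(1.0 - C) * (x + C * y) for x, y in zip(r0, r)]  # step 6: combine
```

With a genuinely low-rank approximation in place of the exact factorization used here, the same steps would return an approximate ranking vector.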

41 Experimental Setup
Dataset: DBLP/authorship (Author-Paper)
  315k nodes
  1,800k edges
Approx. Quality: Relative Accuracy
Application: Center-Piece Subgraph

42 Query Time vs. Pre-Compute Time
[Plot: Log Query Time against Log Pre-compute Time]
Quality: 90%+
On-line: up to 150x speedup
Pre-computation: two orders of magnitude saving

43 Query Time vs. Pre-Storage
[Plot: Log Query Time against Log Storage]
Quality: 90%+
On-line: up to 150x speedup
Pre-storage: three orders of magnitude saving

45 Conclusion
FastRWR
  Reasonable quality preservation (90%+)
  150x speed-up: query time
  Orders of magnitude saving: pre-compute & storage
More in the paper
  The variant of FastRWR and theoretical justification
  Implementation details: normalization, low-rank approximation, sparse
  More experiments: other datasets, other applications

46 Q&A
Thank you!
htong@cs.cmu.edu
www.cs.cmu.edu/~htong