Fast Direction-Aware Proximity for Graph Mining KDD 2007, San Jose Hanghang Tong, Yehuda Koren, Christos Faloutsos.


1 Fast Direction-Aware Proximity for Graph Mining KDD 2007, San Jose Hanghang Tong, Yehuda Koren, Christos Faloutsos

2 Defining Direction-Aware Proximity (DAP): escape probability. Define a random walk (RW) on the graph. Esc_Prob(A -> B): Prob(starting at A, the walk reaches B before returning to A). Esc_Prob = Pr(smile before cry). [Figure: nodes A and B and the remaining graph]

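3 $\mathrm{Esc\_Prob}(1 \to 5) = P(1, \mathcal{I})\,(I - P(\mathcal{I}, \mathcal{I}))^{-1}\,P(\mathcal{I}, 5) + P(1, 5)$, where $\mathcal{I} = V \setminus \{1, 5\}$ is the set of remaining nodes and $I$ is the identity matrix. P: transition matrix (row-normalized)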
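As a rough illustration of the escape-probability definition, the sketch below computes Esc_Prob(i -> j) by first-step analysis: it solves a linear system over the remaining nodes for the probability of reaching j before i, then adds the direct step i -> j. The helper names (`solve`, `row_normalize`, `esc_prob`) and the 3-node path graph are assumptions made for this example, not taken from the slides. It also assumes every remaining node can reach i or j, so that the system is non-singular.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= f * M[col][k]
    return [M[r][n] / M[r][r] for r in range(n)]


def row_normalize(W):
    """Turn a non-negative weight matrix into a row-normalized transition matrix."""
    return [[w / sum(row) if sum(row) else 0.0 for w in row] for row in W]


def esc_prob(P, i, j):
    """Probability that a walk started at i reaches j before returning to i."""
    rest = [k for k in range(len(P)) if k not in (i, j)]  # remaining nodes
    # (I - P(rest, rest)) v = P(rest, j); v[k] = Pr(reach j before i | start at rest[k])
    A = [[(1.0 if r == c else 0.0) - P[r][c] for c in rest] for r in rest]
    v = solve(A, [P[r][j] for r in rest])
    return P[i][j] + sum(P[i][r] * vk for r, vk in zip(rest, v))


# Hypothetical toy graph: undirected path 0 - 1 - 2.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
P = row_normalize(W)
ep = esc_prob(P, 0, 2)
print(ep)  # 0.5: the walk must pass through node 1, which steps to 2 or back to 0 equally often
```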

4 Intuition of Formula. [Figure: the matrix product P*P]

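5 $\mathrm{Esc\_Prob}(1 \to 5) = P(1, \mathcal{I})\,(I - P(\mathcal{I}, \mathcal{I}))^{-1}\,P(\mathcal{I}, 5) + P(1, 5)$, where $\mathcal{I} = V \setminus \{1, 5\}$ is the set of remaining nodes and $I$ is the identity matrix. P: transition matrix (row-normalized)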

6 Challenges. Case 1: Medium-size graph – Matrix inversion is feasible, but… – What if we want many proximities? – Q: How to get all (n^2) proximities efficiently? – A: FastAllDAP! Case 2: Large graph – Matrix inversion is infeasible – Q: How to get one proximity efficiently? – A: FastOneDAP!

7 FastAllDAP. Q1: How to efficiently compute all possible proximities on a medium-size graph? – a.k.a. how to efficiently solve multiple linear systems simultaneously? Goal: reduce the # of matrix inversions!

8 FastAllDAP: Observation. Need two different matrix inversions! [Figure: transition matrix P]

9 FastAllDAP: Rescue. Redundancy among different linear systems! Overlap between the two gray parts of P used for Prox(1 -> 5) and Prox(1 -> 6)!

10 FastAllDAP: Theorem. Proof: by the Sherman-Morrison (SM) lemma. [Theorem statement and example shown as images on the slide]

11 FastAllDAP: Algorithm. Alg. – Compute Q – For i, j = 1, …, n, compute Prox(i -> j) from Q. Computational savings: O(1) matrix inversions instead of O(n^2)! Example – with 1,000 nodes: 1M matrix inversions vs. 1!
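The "one inversion for all pairs" idea can be sketched as follows. The transcript does not preserve the theorem's exact statement, so the identity used here is an assumption derived from block-matrix inversion (a Schur complement) rather than copied from the slide: with Q = (I - cP)^{-1}, the escape probability of the damped walk cP is Q(i,j) / (Q(i,i) Q(j,j) - Q(i,j) Q(j,i)). The damping factor c = 0.9 mirrors the "fly-out probability = 0.1" footnote later in the deck. The function names are hypothetical, and the paper's constants may differ.

```python
def inverse(A):
    """Invert a small dense matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + [1.0 if r == k else 0.0 for k in range(n)]
         for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [x / d for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


def all_pairs_prox(P, c=0.9):
    """All n^2 proximities from a single inversion Q = (I - cP)^{-1}."""
    n = len(P)
    Q = inverse([[(1.0 if r == k else 0.0) - c * P[r][k] for k in range(n)]
                 for r in range(n)])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Only four entries of Q are needed per pair (i, j).
                det = Q[i][i] * Q[j][j] - Q[i][j] * Q[j][i]
                prox[i][j] = Q[i][j] / det
    return prox


# Hypothetical toy graph: undirected path 0 - 1 - 2, already row-normalized.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
prox = all_pairs_prox(P, c=0.9)
print(prox[0][2])  # close to 0.9 * 0.45 = 0.405
```

Each pair costs O(1) work once Q is available, which is where the saving over one inversion per pair comes from.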

12 FastOneDAP. Q1: How to efficiently compute one single proximity on a large graph? – a.k.a. how to solve one linear system efficiently? Goal: avoid matrix inversion!

13 FastOneDAP: Observation. Partial info (4 elements / 2 cols) of Q is enough!

14 FastOneDAP: Observation. Q: How to compute one column of Q? A: Taylor expansion. Reminder: i-th col of Q = Q × [0, …, 0, 1, 0, …, 0]^T

15 FastOneDAP: Observation. Sparse matrix-vector multiplications! i-th col of Q = Q × [0, …, 0, 1, 0, …, 0]^T

16 FastOneDAP: Iterative Alg. Alg. to estimate the i-th col of Q
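A minimal sketch of such an iteration, under the assumption that Q = (I - cP)^{-1} with damping c < 1: the i-th column of Q equals the series e_i + cP e_i + (cP)^2 e_i + …, which can be accumulated with repeated sparse matrix-vector products and no inversion. The dict-of-rows sparse format, the function names, and the proximity formula in `one_prox` (built from four entries in two columns of Q) are illustrative assumptions, not the paper's exact algorithm.

```python
def q_column(P_rows, i, c=0.9, iters=50):
    """Estimate column i of Q = (I - cP)^{-1} via x <- e_i + c * P x.

    P_rows[u] is a dict {v: P(u, v)} holding only the non-zero entries of row u.
    """
    n = len(P_rows)
    x = [0.0] * n
    x[i] = 1.0
    for _ in range(iters):
        # One sparse matrix-vector product: cost proportional to the number of edges.
        y = [c * sum(p * x[v] for v, p in P_rows[u].items()) for u in range(n)]
        y[i] += 1.0
        x = y
    return x


def one_prox(P_rows, i, j, c=0.9, iters=50):
    """Single proximity i -> j from two columns of Q (four entries in total)."""
    qi = q_column(P_rows, i, c, iters)  # qi[u] approximates Q(u, i)
    qj = q_column(P_rows, j, c, iters)  # qj[u] approximates Q(u, j)
    return qj[i] / (qi[i] * qj[j] - qj[i] * qi[j])


# Hypothetical toy graph: undirected path 0 - 1 - 2 in sparse row form.
P_rows = [{1: 1.0}, {0: 0.5, 2: 0.5}, {1: 1.0}]
approx = one_prox(P_rows, 0, 2, c=0.9, iters=300)
print(approx)  # approaches 0.405 as the number of iterations grows
```

The truncation error shrinks roughly like c^k after k iterations, so a smaller c or more iterations gives a tighter estimate.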

17 FastOneDAP: Property. Convergence guaranteed! Computational savings – Example: 100K nodes and 1M edges (50 iterations): 10,000,000x faster! Footnote: 1 col is enough! – (details in paper)

18 Esc_Prob is good, but… Issue #1: – 'Degree-1 node' effect. Issue #2: – Weakly connected pair. Need some practical modifications!

19 Issue #1: 'degree-1 node' effect [Faloutsos+] [Koren+]. No influence for degree-1 nodes (E, F)! – known as the 'pizza delivery guy' problem in undirected graphs. Solution: Universal Absorbing Boundary! Esc_Prob(a->b)=1

20 Universal Absorbing Boundary. U-A-B is a black hole! Footnote: fly-out probability = 0.1

21 Introducing Universal-Absorbing-Boundary. Esc_Prob(a->b)=1; Prox(a->b)=0.91; Prox(a->b)=0.74. Footnote: fly-out probability = 0.1
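The effect of a universal absorbing boundary can be illustrated on a toy graph of our own (not the one in the slide's figure). Without the boundary, hanging a degree-1 node off the path leaves the escape probability unchanged. With a fly-out probability of 0.1 at every step (modeled here, as an assumption, by scaling P with c = 0.9), the extra node lowers the proximity. All names below are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= f * M[col][k]
    return [M[r][n] / M[r][r] for r in range(n)]


def prox(P, i, j, c=1.0):
    """Escape probability i -> j when each step survives with probability c.

    c = 1.0 is plain Esc_Prob; c = 0.9 corresponds to a fly-out probability of 0.1.
    """
    rest = [k for k in range(len(P)) if k not in (i, j)]
    A = [[(1.0 if r == s else 0.0) - c * P[r][s] for s in rest] for r in rest]
    v = solve(A, [c * P[r][j] for r in rest])
    return c * P[i][j] + sum(c * P[i][r] * vk for r, vk in zip(rest, v))


# Path a(0) - m(1) - b(2), row-normalized.
plain = [[0, 1, 0], [0.5, 0, 0.5], [0, 1, 0]]
# Same path with a degree-1 node e(3) attached to m.
t = 1.0 / 3.0
leaf = [[0, 1, 0, 0], [t, 0, t, t], [0, 1, 0, 0], [0, 1, 0, 0]]

no_boundary = (prox(plain, 0, 2), prox(leaf, 0, 2))              # both close to 0.5
with_boundary = (prox(plain, 0, 2, 0.9), prox(leaf, 0, 2, 0.9))  # second value is smaller
print(no_boundary, with_boundary)
```

In this toy case the damped values work out to c^2 / 2 without the leaf and c^2 / (3 - c^2) with it, so the degree-1 node now has an influence.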

22 Issue #2: Weakly connected pair. Prox(A -> B) = Prox(B -> A) = 0. Solution: Partial symmetry!

23 Practical Modifications: Partial Symmetry. Before: Prox(A -> B) = Prox(B -> A) = 0. After: Prox(A -> B) = 0.081 > Prox(B -> A) = 0.009
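The slides do not spell out how partial symmetry is implemented, so the following is only one plausible sketch: every directed edge gets a weak reverse edge, W' = W + beta * W^T, before row normalization. The formula, the value beta = 0.1, the function names, and the toy graph (A -> C <- B) are all assumptions for illustration. The slide's values 0.081 and 0.009 come from a different graph and are not reproduced here.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[r])) + [float(b[r])] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= f * M[col][k]
    return [M[r][n] / M[r][r] for r in range(n)]


def row_normalize(W):
    """Row-normalize weights; nodes without out-edges keep an all-zero row."""
    return [[w / sum(row) if sum(row) else 0.0 for w in row] for row in W]


def esc_prob(P, i, j):
    """Probability that a walk started at i reaches j before returning to i."""
    rest = [k for k in range(len(P)) if k not in (i, j)]
    A = [[(1.0 if r == s else 0.0) - P[r][s] for s in rest] for r in rest]
    v = solve(A, [P[r][j] for r in rest])
    return P[i][j] + sum(P[i][r] * vk for r, vk in zip(rest, v))


def partial_symmetry(W, beta=0.1):
    """Assumed form: add a reverse edge of relative weight beta for each edge."""
    n = len(W)
    return [[W[r][s] + beta * W[s][r] for s in range(n)] for r in range(n)]


# Weakly connected pair: A(0) -> C(2) <- B(1); no directed path between A and B.
W = [[0, 0, 1],
     [0, 0, 1],
     [0, 0, 0]]
before = esc_prob(row_normalize(W), 0, 1)
after = esc_prob(row_normalize(partial_symmetry(W)), 0, 1)
print(before, after)  # before is 0.0; after is positive
```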

24 Efficiency: FastAllDAP. [Figure: time (sec) vs. size of graph, Straight-Solver vs. FastAllDAP] 1,000x faster!

25 Efficiency: FastOneDAP. [Figure: time (sec) vs. size of graph, FastOneDAP vs. Straight-Solver] 10,000x faster!

26 Link Prediction: existence
Dataset   Accuracy (DAP)   Accuracy (UDAP)
WL        65.40%
PC        79.60%           80.78%
AE        81.51%           80.60%
CN        86.71%           84.00%
EP        92.21%           92.09%

27 Link Prediction: direction. Q: Given the existence of the link, what is the direction of the link? A: Compare Prox(i -> j) and Prox(j -> i). [Figure: density of Prox(i -> j) - Prox(j -> i); annotation: >70%]
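The comparison rule is simple enough to state in code. In this small sketch the function name and the 2x2 proximity table are hypothetical; the two numbers reuse the Prox(A -> B) = 0.081 and Prox(B -> A) = 0.009 values from the partial-symmetry slide purely as sample input.

```python
def predict_direction(prox, i, j):
    """Return (source, target) for the direction with the larger proximity.

    Returns None when the two proximities are equal and the rule cannot decide.
    """
    if prox[i][j] > prox[j][i]:
        return (i, j)
    if prox[j][i] > prox[i][j]:
        return (j, i)
    return None


# prox[u][v] holds Prox(u -> v); index 0 is A, index 1 is B.
prox = [[0.0, 0.081],
        [0.009, 0.0]]
direction = predict_direction(prox, 0, 1)
print(direction)  # (0, 1), i.e. the link is predicted to point from A to B
```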
